Overview


Dataset statistics

Number of variables: 64
Number of observations: 601451
Missing cells: 14933142
Missing cells (%): 38.8%
Total size in memory: 293.7 MiB
Average record size in memory: 512.0 B
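
The derived figures in this table can be cross-checked from the raw counts. The sketch below is only an illustration: it assumes that the missing-cell percentage is missing cells divided by (variables × observations) and that the average record size is total memory divided by observations. The variable names are hypothetical and are not taken from the report.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 64
n_observations = 601_451
missing_cells = 14_933_142
total_size_mib = 293.7

# Assumed definitions (not stated in the report itself).
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells
avg_record_bytes = total_size_mib * 1024 ** 2 / n_observations

print(f"Total cells: {total_cells}")
print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {avg_record_bytes:.1f} B")
```

Under these assumptions the recomputed values round to the 38.8% and 512.0 B shown in the table.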

Variable types

Text: 64

Dataset

Description: Mammal NMNH Extant Specimen Records 0054884-241126133413365
URL: https://doi.org/10.15468/dl.dys66y

Alerts

collectionID has constant value "urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22" Constant
collectionCode has constant value "MAMM" Constant
datasetName has constant value "NMNH Extant Biology" Constant
kingdom has constant value "Animalia" Constant
phylum has constant value "Chordata" Constant
class has constant value "Mammalia" Constant
taxonRank has constant value "subspecies" Constant
recordNumber has 50821 (8.4%) missing values Missing
recordedBy has 55563 (9.2%) missing values Missing
lifeStage has 549447 (91.4%) missing values Missing
preparations has 26965 (4.5%) missing values Missing
associatedMedia has 45503 (7.6%) missing values Missing
associatedSequences has 600397 (99.8%) missing values Missing
occurrenceRemarks has 590662 (98.2%) missing values Missing
eventDate has 28127 (4.7%) missing values Missing
startDayOfYear has 46793 (7.8%) missing values Missing
endDayOfYear has 46765 (7.8%) missing values Missing
year has 28127 (4.7%) missing values Missing
month has 44866 (7.5%) missing values Missing
day has 67482 (11.2%) missing values Missing
verbatimEventDate has 36490 (6.1%) missing values Missing
habitat has 468915 (78.0%) missing values Missing
waterBody has 539858 (89.8%) missing values Missing
islandGroup has 596682 (99.2%) missing values Missing
island has 564842 (93.9%) missing values Missing
country has 6532 (1.1%) missing values Missing
stateProvince has 93954 (15.6%) missing values Missing
county has 447402 (74.4%) missing values Missing
locality has 35404 (5.9%) missing values Missing
minimumElevationInMeters has 496901 (82.6%) missing values Missing
maximumElevationInMeters has 597572 (99.4%) missing values Missing
verbatimElevation has 599861 (99.7%) missing values Missing
minimumDepthInMeters has 601448 (> 99.9%) missing values Missing
decimalLatitude has 448433 (74.6%) missing values Missing
decimalLongitude has 448433 (74.6%) missing values Missing
geodeticDatum has 594543 (98.9%) missing values Missing
verbatimLatitude has 466631 (77.6%) missing values Missing
verbatimLongitude has 466723 (77.6%) missing values Missing
verbatimCoordinateSystem has 468202 (77.8%) missing values Missing
georeferenceProtocol has 592196 (98.5%) missing values Missing
georeferenceRemarks has 601383 (> 99.9%) missing values Missing
identificationQualifier has 599947 (99.7%) missing values Missing
typeStatus has 597685 (99.4%) missing values Missing
identifiedBy has 593267 (98.6%) missing values Missing
subgenus has 601149 (99.9%) missing values Missing
infraspecificEpithet has 314922 (52.4%) missing values Missing
taxonRank has 314922 (52.4%) missing values Missing
scientificNameAuthorship has 555607 (92.4%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique

Reproduction

Analysis started: 2025-01-14 16:48:32.631479
Analysis finished: 2025-01-14 16:48:48.352677
Duration: 15.72 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
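
As a quick consistency check, the reported duration can be recomputed from the two timestamps. This is a minimal sketch using only the Python standard library; the variable names are illustrative.

```python
from datetime import datetime

# Timestamps copied from the "Reproduction" section.
started = datetime.fromisoformat("2025-01-14 16:48:32.631479")
finished = datetime.fromisoformat("2025-01-14 16:48:48.352677")

duration = (finished - started).total_seconds()
print(f"Duration: {duration:.2f} seconds")
```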

Variables

gbifID
Text

Unique 

Distinct: 601451
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 6014510
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
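
To illustrate how such category counts can be derived, here is a minimal sketch based on Python's standard `unicodedata` module. It is not the profiler's own implementation. The helper `category_counts` and the `CATEGORY_NAMES` mapping are hypothetical names introduced for this example, and the mapping covers only a few general categories that appear in this report. The standard library does not expose script or block properties, so only categories are counted.

```python
import unicodedata
from collections import Counter

# Long names for a few Unicode general-category codes seen in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Zs": "Space Separator",
    "Pd": "Dash Punctuation",
    "Po": "Other Punctuation",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            # Fall back to the two-letter code for unmapped categories.
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# The first three gbifID sample values from this report.
sample = ["1322535732", "1322538146", "1317206206"]
counts = category_counts(sample)
print(counts)
```

Every character in these sample values is a decimal digit, which is consistent with the single distinct category reported for gbifID.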

Unique

Unique: 601451
Unique (%): 100.0%

Sample

1st row: 1322535732
2nd row: 1322538146
3rd row: 1317206206
4th row: 1317210025
5th row: 1317210456
ValueCountFrequency (%)
1322535732 1
 
< 0.1%
1322555094 1
 
< 0.1%
1322560018 1
 
< 0.1%
1322558352 1
 
< 0.1%
1317224532 1
 
< 0.1%
4041103536 1
 
< 0.1%
1317206206 1
 
< 0.1%
1317210025 1
 
< 0.1%
1317210456 1
 
< 0.1%
1317211504 1
 
< 0.1%
Other values (601441) 601441
> 99.9%

Most occurring characters

ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6014510
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring scripts

ValueCountFrequency (%)
Common 6014510
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6014510
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1342473
22.3%
3 953825
15.9%
2 772027
12.8%
8 469400
 
7.8%
9 463026
 
7.7%
0 459240
 
7.6%
7 444579
 
7.4%
4 377786
 
6.3%
5 367488
 
6.1%
6 364666
 
6.1%
Distinct: 29672
Distinct (%): 4.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11427569
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 12662
Unique (%): 2.1%

Sample

1st row: 2021-08-09 14:50:00
2nd row: 2020-04-09 11:54:00
3rd row: 2020-03-17 10:16:00
4th row: 2020-05-20 10:50:00
5th row: 2017-12-08 15:28:00
ValueCountFrequency (%)
2017-12-08 28553
 
2.4%
2021-01-15 25810
 
2.1%
2020-07-24 12948
 
1.1%
2020-04-09 11060
 
0.9%
2020-03-12 10837
 
0.9%
2020-04-13 9731
 
0.8%
2020-04-14 8525
 
0.7%
2020-04-06 8277
 
0.7%
2020-03-25 8028
 
0.7%
2020-04-02 7941
 
0.7%
Other values (2209) 1071192
89.1%

Most occurring characters

ValueCountFrequency (%)
0 3217264
28.2%
2 1683114
14.7%
1 1412891
12.4%
- 1202902
 
10.5%
: 1202902
 
10.5%
(space) 601451
 
5.3%
4 455860
 
4.0%
3 439973
 
3.9%
5 428273
 
3.7%
9 215795
 
1.9%
Other values (3) 567144
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8420314
73.7%
Dash Punctuation 1202902
 
10.5%
Other Punctuation 1202902
 
10.5%
Space Separator 601451
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3217264
38.2%
2 1683114
20.0%
1 1412891
16.8%
4 455860
 
5.4%
3 439973
 
5.2%
5 428273
 
5.1%
9 215795
 
2.6%
6 207959
 
2.5%
7 187569
 
2.2%
8 171616
 
2.0%
Dash Punctuation
ValueCountFrequency (%)
- 1202902
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1202902
100.0%
Space Separator
ValueCountFrequency (%)
(space) 601451
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 11427569
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 3217264
28.2%
2 1683114
14.7%
1 1412891
12.4%
- 1202902
 
10.5%
: 1202902
 
10.5%
(space) 601451
 
5.3%
4 455860
 
4.0%
3 439973
 
3.9%
5 428273
 
3.7%
9 215795
 
1.9%
Other values (3) 567144
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11427569
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 3217264
28.2%
2 1683114
14.7%
1 1412891
12.4%
- 1202902
 
10.5%
: 1202902
 
10.5%
(space) 601451
 
5.3%
4 455860
 
4.0%
3 439973
 
3.9%
5 428273
 
3.7%
9 215795
 
1.9%
Other values (3) 567144
 
5.0%
Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 29
Mean length: 28.8108624
Min length: 2

Characters and Unicode

Total characters: 17328322
Distinct characters: 41
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871
ValueCountFrequency (%)
urn:lsid:biocol.org:col:34871 596967
99.3%
nsmt 977
 
0.2%
uam 775
 
0.1%
nrm 386
 
0.1%
rmnh 354
 
0.1%
rcs 246
 
< 0.1%
nmv 238
 
< 0.1%
nmsz 188
 
< 0.1%
zmmu 179
 
< 0.1%
fcmm 127
 
< 0.1%
Other values (40) 1015
 
0.2%

Most occurring characters

ValueCountFrequency (%)
o 2387868
13.8%
: 2387868
13.8%
l 1790901
 
10.3%
i 1193934
 
6.9%
r 1193934
 
6.9%
c 1193934
 
6.9%
g 596967
 
3.4%
7 596967
 
3.4%
8 596967
 
3.4%
4 596967
 
3.4%
Other values (31) 4792015
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 11342373
65.5%
Other Punctuation 2984837
 
17.2%
Decimal Number 2984835
 
17.2%
Uppercase Letter 16276
 
0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 4384
26.9%
N 2583
15.9%
S 1796
11.0%
A 1319
 
8.1%
U 1175
 
7.2%
R 1035
 
6.4%
T 978
 
6.0%
C 551
 
3.4%
H 550
 
3.4%
Z 467
 
2.9%
Other values (11) 1438
 
8.8%
Lowercase Letter
ValueCountFrequency (%)
o 2387868
21.1%
l 1790901
15.8%
i 1193934
10.5%
r 1193934
10.5%
c 1193934
10.5%
g 596967
 
5.3%
u 596967
 
5.3%
b 596967
 
5.3%
d 596967
 
5.3%
s 596967
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 596967
20.0%
8 596967
20.0%
4 596967
20.0%
3 596967
20.0%
1 596967
20.0%
Other Punctuation
ValueCountFrequency (%)
: 2387868
80.0%
. 596967
 
20.0%
? 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11358649
65.5%
Common 5969673
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 2387868
21.0%
l 1790901
15.8%
i 1193934
10.5%
r 1193934
10.5%
c 1193934
10.5%
g 596967
 
5.3%
u 596967
 
5.3%
b 596967
 
5.3%
d 596967
 
5.3%
s 596967
 
5.3%
Other values (22) 613243
 
5.4%
Common
ValueCountFrequency (%)
: 2387868
40.0%
7 596967
 
10.0%
8 596967
 
10.0%
4 596967
 
10.0%
3 596967
 
10.0%
. 596967
 
10.0%
1 596967
 
10.0%
? 2
 
< 0.1%
(space) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17328322
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 2387868
13.8%
: 2387868
13.8%
l 1790901
 
10.3%
i 1193934
 
6.9%
r 1193934
 
6.9%
c 1193934
 
6.9%
g 596967
 
3.4%
7 596967
 
3.4%
8 596967
 
3.4%
4 596967
 
3.4%
Other values (31) 4792015
27.7%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 27065295
Distinct characters: 22
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
2nd row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
3rd row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
4th row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
5th row: urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22
ValueCountFrequency (%)
urn:uuid:59e56a59-8615-4e0c-841d-eb88f3876b22 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
8 3007255
 
11.1%
- 2405804
 
8.9%
5 2405804
 
8.9%
6 1804353
 
6.7%
e 1804353
 
6.7%
u 1804353
 
6.7%
d 1202902
 
4.4%
9 1202902
 
4.4%
: 1202902
 
4.4%
1 1202902
 
4.4%
Other values (12) 9021765
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 13833373
51.1%
Lowercase Letter 9623216
35.6%
Dash Punctuation 2405804
 
8.9%
Other Punctuation 1202902
 
4.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 3007255
21.7%
5 2405804
17.4%
6 1804353
13.0%
9 1202902
 
8.7%
1 1202902
 
8.7%
4 1202902
 
8.7%
2 1202902
 
8.7%
0 601451
 
4.3%
3 601451
 
4.3%
7 601451
 
4.3%
Lowercase Letter
ValueCountFrequency (%)
e 1804353
18.8%
u 1804353
18.8%
d 1202902
12.5%
b 1202902
12.5%
i 601451
 
6.2%
a 601451
 
6.2%
r 601451
 
6.2%
n 601451
 
6.2%
c 601451
 
6.2%
f 601451
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 2405804
100.0%
Other Punctuation
ValueCountFrequency (%)
: 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17442079
64.4%
Latin 9623216
35.6%

Most frequent character per script

Common
ValueCountFrequency (%)
8 3007255
17.2%
- 2405804
13.8%
5 2405804
13.8%
6 1804353
10.3%
9 1202902
 
6.9%
: 1202902
 
6.9%
1 1202902
 
6.9%
4 1202902
 
6.9%
2 1202902
 
6.9%
0 601451
 
3.4%
Other values (2) 1202902
 
6.9%
Latin
ValueCountFrequency (%)
e 1804353
18.8%
u 1804353
18.8%
d 1202902
12.5%
b 1202902
12.5%
i 601451
 
6.2%
a 601451
 
6.2%
r 601451
 
6.2%
n 601451
 
6.2%
c 601451
 
6.2%
f 601451
 
6.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27065295
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 3007255
 
11.1%
- 2405804
 
8.9%
5 2405804
 
8.9%
6 1804353
 
6.7%
e 1804353
 
6.7%
u 1804353
 
6.7%
d 1202902
 
4.4%
9 1202902
 
4.4%
: 1202902
 
4.4%
1 1202902
 
4.4%
Other values (12) 9021765
33.3%
Distinct: 50
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 7
Median length: 4
Mean length: 3.997244996
Min length: 2

Characters and Unicode

Total characters: 2404147
Distinct characters: 23
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13
Unique (%): < 0.1%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM
ValueCountFrequency (%)
usnm 596967
99.3%
nsmt 977
 
0.2%
uam 775
 
0.1%
nrm 386
 
0.1%
rmnh 354
 
0.1%
rcs 246
 
< 0.1%
nmv 238
 
< 0.1%
nmsz 188
 
< 0.1%
zmmu 179
 
< 0.1%
fcmm 127
 
< 0.1%
Other values (40) 1015
 
0.2%

Most occurring characters

ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (13) 1441
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2404144
> 99.9%
Other Punctuation 2
 
< 0.1%
Space Separator 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (11) 1438
 
0.1%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%
Space Separator
ValueCountFrequency (%)
(space) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2404144
> 99.9%
Common 3
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (11) 1438
 
0.1%
Common
ValueCountFrequency (%)
? 2
66.7%
(space) 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2404147
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 601351
25.0%
N 599550
24.9%
S 598763
24.9%
U 598142
24.9%
A 1319
 
0.1%
R 1035
 
< 0.1%
T 978
 
< 0.1%
C 551
 
< 0.1%
H 550
 
< 0.1%
Z 467
 
< 0.1%
Other values (13) 1441
 
0.1%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2405804
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MAMM
2nd row: MAMM
3rd row: MAMM
4th row: MAMM
5th row: MAMM
ValueCountFrequency (%)
mamm 601451
100.0%

Most occurring characters

ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2405804
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2405804
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2405804
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1804353
75.0%
A 601451
 
25.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 11427569
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology
ValueCountFrequency (%)
nmnh 601451
33.3%
extant 601451
33.3%
biology 601451
33.3%

Most occurring characters

ValueCountFrequency (%)
N 1202902
 
10.5%
(space) 1202902
 
10.5%
t 1202902
 
10.5%
o 1202902
 
10.5%
M 601451
 
5.3%
H 601451
 
5.3%
E 601451
 
5.3%
x 601451
 
5.3%
a 601451
 
5.3%
n 601451
 
5.3%
Other values (5) 3007255
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 6615961
57.9%
Uppercase Letter 3608706
31.6%
Space Separator 1202902
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 1202902
18.2%
o 1202902
18.2%
x 601451
9.1%
a 601451
9.1%
n 601451
9.1%
i 601451
9.1%
l 601451
9.1%
g 601451
9.1%
y 601451
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 1202902
33.3%
M 601451
16.7%
H 601451
16.7%
E 601451
16.7%
B 601451
16.7%
Space Separator
ValueCountFrequency (%)
(space) 1202902
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10224667
89.5%
Common 1202902
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 1202902
11.8%
t 1202902
11.8%
o 1202902
11.8%
M 601451
 
5.9%
H 601451
 
5.9%
E 601451
 
5.9%
x 601451
 
5.9%
a 601451
 
5.9%
n 601451
 
5.9%
B 601451
 
5.9%
Other values (4) 2405804
23.5%
Common
ValueCountFrequency (%)
(space) 1202902
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11427569
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 1202902
 
10.5%
(space) 1202902
 
10.5%
t 1202902
 
10.5%
o 1202902
 
10.5%
M 601451
 
5.3%
H 601451
 
5.3%
E 601451
 
5.3%
x 601451
 
5.3%
a 601451
 
5.3%
n 601451
 
5.3%
Other values (5) 3007255
26.3%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 17
Median length: 17
Mean length: 16.95205428
Min length: 16

Characters and Unicode

Total characters: 10195830
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: HumanObservation
ValueCountFrequency (%)
preservedspecimen 572614
95.2%
humanobservation 28837
 
4.8%

Most occurring characters

ValueCountFrequency (%)
e 2891907
28.4%
r 1174065
11.5%
n 630288
 
6.2%
i 601451
 
5.9%
s 601451
 
5.9%
v 601451
 
5.9%
m 601451
 
5.9%
c 572614
 
5.6%
P 572614
 
5.6%
p 572614
 
5.6%
Other values (9) 1375924
13.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8992928
88.2%
Uppercase Letter 1202902
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2891907
32.2%
r 1174065
13.1%
n 630288
 
7.0%
i 601451
 
6.7%
s 601451
 
6.7%
v 601451
 
6.7%
m 601451
 
6.7%
c 572614
 
6.4%
p 572614
 
6.4%
d 572614
 
6.4%
Other values (5) 173022
 
1.9%
Uppercase Letter
ValueCountFrequency (%)
P 572614
47.6%
S 572614
47.6%
H 28837
 
2.4%
O 28837
 
2.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 10195830
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2891907
28.4%
r 1174065
11.5%
n 630288
 
6.2%
i 601451
 
5.9%
s 601451
 
5.9%
v 601451
 
5.9%
m 601451
 
5.9%
c 572614
 
5.6%
P 572614
 
5.6%
p 572614
 
5.6%
Other values (9) 1375924
13.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10195830
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 2891907
28.4%
r 1174065
11.5%
n 630288
 
6.2%
i 601451
 
5.9%
s 601451
 
5.9%
v 601451
 
5.9%
m 601451
 
5.9%
c 572614
 
5.6%
P 572614
 
5.6%
p 572614
 
5.6%
Other values (9) 1375924
13.5%

occurrenceID
Text

Unique 

Distinct: 601451
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 37891413
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601451
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3ebec6a7f-5e95-4543-b061-6d73d80dd2ee
2nd row: http://n2t.net/ark:/65665/3ec070d5d-1893-4600-afa5-e56695ff219b
3rd row: http://n2t.net/ark:/65665/3002acaf9-9788-4539-8883-fe6bfd5f8d88
4th row: http://n2t.net/ark:/65665/300553499-1544-460e-9507-55ada241f992
5th row: http://n2t.net/ark:/65665/3005a3503-9c20-443c-899a-559e550dc71e
ValueCountFrequency (%)
http://n2t.net/ark:/65665/3ebec6a7f-5e95-4543-b061-6d73d80dd2ee 1
 
< 0.1%
http://n2t.net/ark:/65665/3ecc76d35-e5c5-434e-874b-88c5d85dbb91 1
 
< 0.1%
http://n2t.net/ark:/65665/3ecff6276-27d1-4ad7-aac3-32c485b9bed6 1
 
< 0.1%
http://n2t.net/ark:/65665/3eceb4d85-2fbe-4bf2-aef7-b3393445f319 1
 
< 0.1%
http://n2t.net/ark:/65665/300f96572-4f6d-48dc-9b78-1ba0e03bb0ae 1
 
< 0.1%
http://n2t.net/ark:/65665/3ec5d68e1-4786-40d2-9bdb-bb8ef2ad056d 1
 
< 0.1%
http://n2t.net/ark:/65665/3002acaf9-9788-4539-8883-fe6bfd5f8d88 1
 
< 0.1%
http://n2t.net/ark:/65665/300553499-1544-460e-9507-55ada241f992 1
 
< 0.1%
http://n2t.net/ark:/65665/3005a3503-9c20-443c-899a-559e550dc71e 1
 
< 0.1%
http://n2t.net/ark:/65665/300664e6c-5334-4a8e-b9a7-4d84389595e0 1
 
< 0.1%
Other values (601441) 601441
> 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 3007255
 
7.9%
6 2930823
 
7.7%
- 2405804
 
6.3%
t 2405804
 
6.3%
5 2330760
 
6.2%
a 1878835
 
5.0%
e 1729856
 
4.6%
2 1729289
 
4.6%
3 1728046
 
4.6%
4 1727823
 
4.6%
Other values (16) 16017118
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16387822
43.2%
Lowercase Letter 14286179
37.7%
Other Punctuation 4811608
 
12.7%
Dash Punctuation 2405804
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2405804
16.8%
a 1878835
13.2%
e 1729856
12.1%
b 1278851
9.0%
n 1202902
8.4%
f 1128774
7.9%
c 1128212
7.9%
d 1127141
7.9%
k 601451
 
4.2%
r 601451
 
4.2%
Other values (2) 1202902
8.4%
Decimal Number
ValueCountFrequency (%)
6 2930823
17.9%
5 2330760
14.2%
2 1729289
10.6%
3 1728046
10.5%
4 1727823
10.5%
9 1279292
7.8%
8 1278534
7.8%
0 1129193
 
6.9%
7 1127612
 
6.9%
1 1126450
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 3007255
62.5%
: 1202902
 
25.0%
. 601451
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 2405804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 23605234
62.3%
Latin 14286179
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 3007255
12.7%
6 2930823
12.4%
- 2405804
10.2%
5 2330760
9.9%
2 1729289
7.3%
3 1728046
7.3%
4 1727823
7.3%
9 1279292
 
5.4%
8 1278534
 
5.4%
: 1202902
 
5.1%
Other values (4) 3984706
16.9%
Latin
ValueCountFrequency (%)
t 2405804
16.8%
a 1878835
13.2%
e 1729856
12.1%
b 1278851
9.0%
n 1202902
8.4%
f 1128774
7.9%
c 1128212
7.9%
d 1127141
7.9%
k 601451
 
4.2%
r 601451
 
4.2%
Other values (2) 1202902
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37891413
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 3007255
 
7.9%
6 2930823
 
7.7%
- 2405804
 
6.3%
t 2405804
 
6.3%
5 2330760
 
6.2%
a 1878835
 
5.0%
e 1729856
 
4.6%
2 1729289
 
4.6%
3 1728046
 
4.6%
4 1727823
 
4.6%
Other values (16) 16017118
42.3%
Distinct: 601428
Distinct (%): > 99.9%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 16
Median length: 11
Mean length: 10.92069179
Min length: 4

Characters and Unicode

Total characters: 6568261
Distinct characters: 35
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 601407
Unique (%): > 99.9%

Sample

1st row: USNM 449558
2nd row: USNM 226903
3rd row: USNM 386480
4th row: USNM 68620
5th row: USNM MME9342
ValueCountFrequency (%)
usnm 596967
49.8%
wam 63
 
< 0.1%
mb 40
 
< 0.1%
zin 21
 
< 0.1%
lacm 18
 
< 0.1%
nsmt 12
 
< 0.1%
sama 6
 
< 0.1%
zmmu 5
 
< 0.1%
rmnh 4
 
< 0.1%
ncsm 4
 
< 0.1%
Other values (601439) 601471
50.2%

Most occurring characters

ValueCountFrequency (%)
M 627122
9.5%
S 616877
 
9.4%
N 601401
 
9.2%
U 598144
 
9.1%
(space) 597160
 
9.1%
1 405808
 
6.2%
2 403390
 
6.1%
3 394478
 
6.0%
5 393693
 
6.0%
4 379861
 
5.8%
Other values (25) 1550327
23.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3465081
52.8%
Uppercase Letter 2506018
38.2%
Space Separator 597160
 
9.1%
Other Punctuation 2
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 627122
25.0%
S 616877
24.6%
N 601401
24.0%
U 598144
23.9%
R 17298
 
0.7%
T 17251
 
0.7%
E 14721
 
0.6%
A 10176
 
0.4%
C 553
 
< 0.1%
H 550
 
< 0.1%
Other values (13) 1925
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 405808
11.7%
2 403390
11.6%
3 394478
11.4%
5 393693
11.4%
4 379861
11.0%
6 309193
8.9%
7 297996
8.6%
0 295420
8.5%
8 295286
8.5%
9 289956
8.4%
Space Separator
ValueCountFrequency (%)
(space) 597160
100.0%
Other Punctuation
ValueCountFrequency (%)
? 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4062243
61.8%
Latin 2506018
38.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 627122
25.0%
S 616877
24.6%
N 601401
24.0%
U 598144
23.9%
R 17298
 
0.7%
T 17251
 
0.7%
E 14721
 
0.6%
A 10176
 
0.4%
C 553
 
< 0.1%
H 550
 
< 0.1%
Other values (13) 1925
 
0.1%
Common
ValueCountFrequency (%)
(space) 597160
14.7%
1 405808
10.0%
2 403390
9.9%
3 394478
9.7%
5 393693
9.7%
4 379861
9.4%
6 309193
7.6%
7 297996
7.3%
0 295420
7.3%
8 295286
7.3%
Other values (2) 289958
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6568261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 627122
9.5%
S 616877
 
9.4%
N 601401
 
9.2%
U 598144
 
9.1%
(space) 597160
 
9.1%
1 405808
 
6.2%
2 403390
 
6.1%
3 394478
 
6.0%
5 393693
 
6.0%
4 379861
 
5.8%
Other values (25) 1550327
23.6%

recordNumber
Text

Missing 

Distinct: 172937
Distinct (%): 31.4%
Missing: 50821
Missing (%): 8.4%
Memory size: 4.6 MiB

Length

Max length: 35
Median length: 28
Mean length: 5.176632221
Min length: 1

Characters and Unicode

Total characters: 2850409
Distinct characters: 79
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 147848
Unique (%): 26.9%

Sample

1st row: FMG 2371
2nd row: 142/19534X
3rd row: 07960
4th row: 6459
5th row: B47586/R50468
ValueCountFrequency (%)
no 47434
 
6.9%
number 47222
 
6.9%
cohjr 5988
 
0.9%
nzp 3372
 
0.5%
psc 2713
 
0.4%
jwk 2021
 
0.3%
r 1947
 
0.3%
fm 1793
 
0.3%
jjg 1781
 
0.3%
rem 1569
 
0.2%
Other values (105383) 570874
83.1%

Most occurring characters

ValueCountFrequency (%)
1 307242
 
10.8%
2 246234
 
8.6%
3 208467
 
7.3%
4 190900
 
6.7%
0 182605
 
6.4%
5 181877
 
6.4%
6 173588
 
6.1%
7 165796
 
5.8%
8 159989
 
5.6%
9 153227
 
5.4%
Other values (69) 880484
30.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1969925
69.1%
Uppercase Letter 409557
 
14.4%
Lowercase Letter 285569
 
10.0%
Space Separator 136084
 
4.8%
Other Punctuation 26739
 
0.9%
Dash Punctuation 20734
 
0.7%
Close Punctuation 888
 
< 0.1%
Open Punctuation 886
 
< 0.1%
Currency Symbol 13
 
< 0.1%
Math Symbol 10
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N 106292
26.0%
R 28947
 
7.1%
M 24702
 
6.0%
J 23837
 
5.8%
C 21743
 
5.3%
H 19696
 
4.8%
X 17857
 
4.4%
B 15635
 
3.8%
P 15412
 
3.8%
E 14048
 
3.4%
Other values (16) 121388
29.6%
Lowercase Letter
ValueCountFrequency (%)
r 47347
16.6%
e 47325
16.6%
o 47216
16.5%
m 47180
16.5%
u 47177
16.5%
b 47174
16.5%
n 1310
 
0.5%
a 152
 
0.1%
p 115
 
< 0.1%
i 108
 
< 0.1%
Other values (13) 465
 
0.2%
Decimal Number
ValueCountFrequency (%)
1 307242
15.6%
2 246234
12.5%
3 208467
10.6%
4 190900
9.7%
0 182605
9.3%
5 181877
9.2%
6 173588
8.8%
7 165796
8.4%
8 159989
8.1%
9 153227
7.8%
Other Punctuation
ValueCountFrequency (%)
/ 23475
87.8%
. 2050
 
7.7%
, 626
 
2.3%
# 248
 
0.9%
? 202
 
0.8%
& 47
 
0.2%
; 44
 
0.2%
: 22
 
0.1%
* 21
 
0.1%
' 4
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 887
99.9%
] 1
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 885
99.9%
[ 1
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 6
60.0%
+ 4
40.0%
Space Separator
ValueCountFrequency (%)
(space) 136084
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 20734
100.0%
Currency Symbol
ValueCountFrequency (%)
¢ 13
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2155283
75.6%
Latin 695126
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 106292
15.3%
r 47347
 
6.8%
e 47325
 
6.8%
o 47216
 
6.8%
m 47180
 
6.8%
u 47177
 
6.8%
b 47174
 
6.8%
R 28947
 
4.2%
M 24702
 
3.6%
J 23837
 
3.4%
Other values (39) 227929
32.8%
Common
ValueCountFrequency (%)
1 307242
14.3%
2 246234
11.4%
3 208467
9.7%
4 190900
8.9%
0 182605
8.5%
5 181877
8.4%
6 173588
8.1%
7 165796
7.7%
8 159989
7.4%
9 153227
7.1%
Other values (20) 185358
8.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2850396
> 99.9%
None 13
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 307242
 
10.8%
2 246234
 
8.6%
3 208467
 
7.3%
4 190900
 
6.7%
0 182605
 
6.4%
5 181877
 
6.4%
6 173588
 
6.1%
7 165796
 
5.8%
8 159989
 
5.6%
9 153227
 
5.4%
Other values (68) 880471
30.9%
None
ValueCountFrequency (%)
¢ 13
100.0%

recordedBy
Text

Missing 

Distinct: 17644
Distinct (%): 3.2%
Missing: 55563
Missing (%): 9.2%
Memory size: 4.6 MiB

Length

Max length: 124
Median length: 114
Mean length: 11.92282483
Min length: 1

Characters and Unicode

Total characters: 6508527
Distinct characters: 80
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
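
As a rough illustration of how per-category counts like the ones in this report can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` and the `CATEGORY_NAMES` mapping are illustrative assumptions, not part of the profiling tool. Only categories are covered here, because the standard-library module does not expose script or block properties.

```python
import unicodedata
from collections import Counter

# Long names for some general-category codes seen in this report.
# This mapping is a hand-written assumption, not an exhaustive list.
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
}


def category_counts(values):
    """Count characters per Unicode general category across an iterable of strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)  # two-letter code such as "Lu"
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three of the recordedBy sample values shown in this section.
sample = ["F. Greenwell", "J. Silver", "Nelson & E. Goldman"]
counts = category_counts(sample)
```

Applied to a whole column, the same loop would yield a table comparable to "Most occurring categories", assuming missing values are dropped first.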

Unique

Unique: 9079
Unique (%): 1.7%

Sample

1st row: F. Greenwell
2nd row: J. Silver
3rd row: Smithsonian Venezuelan Project
4th row: Nelson & E. Goldman
5th row: W. Bowen & V. Thayer
ValueCountFrequency (%)
j 60783
 
4.7%
e 54366
 
4.2%
c 53496
 
4.2%
50457
 
3.9%
r 49868
 
3.9%
a 44074
 
3.4%
w 37880
 
2.9%
h 30720
 
2.4%
d 24753
 
1.9%
m 23831
 
1.9%
Other values (10447) 856734
66.6%

Most occurring characters

ValueCountFrequency (%)
741074
 
11.4%
e 563544
 
8.7%
. 539103
 
8.3%
n 389678
 
6.0%
a 341353
 
5.2%
o 335107
 
5.1%
r 327053
 
5.0%
l 295446
 
4.5%
i 245022
 
3.8%
s 228632
 
3.5%
Other values (70) 2502515
38.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3897970
59.9%
Uppercase Letter 1254996
 
19.3%
Space Separator 741074
 
11.4%
Other Punctuation 599060
 
9.2%
Close Punctuation 5447
 
0.1%
Open Punctuation 5376
 
0.1%
Dash Punctuation 2452
 
< 0.1%
Decimal Number 2151
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 563544
14.5%
n 389678
10.0%
a 341353
8.8%
o 335107
8.6%
r 327053
 
8.4%
l 295446
 
7.6%
i 245022
 
6.3%
s 228632
 
5.9%
t 223935
 
5.7%
h 116266
 
3.0%
Other values (18) 831934
21.3%
Uppercase Letter
ValueCountFrequency (%)
R 91216
 
7.3%
M 88625
 
7.1%
C 87417
 
7.0%
S 86724
 
6.9%
H 84189
 
6.7%
G 82831
 
6.6%
J 76177
 
6.1%
A 70972
 
5.7%
E 64988
 
5.2%
P 62861
 
5.0%
Other values (16) 458996
36.6%
Other Punctuation
ValueCountFrequency (%)
. 539103
90.0%
& 50656
 
8.5%
, 8029
 
1.3%
' 1002
 
0.2%
/ 114
 
< 0.1%
: 78
 
< 0.1%
? 29
 
< 0.1%
" 26
 
< 0.1%
; 13
 
< 0.1%
# 10
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1561
72.6%
8 243
 
11.3%
2 219
 
10.2%
4 34
 
1.6%
6 33
 
1.5%
0 31
 
1.4%
9 12
 
0.6%
5 8
 
0.4%
3 7
 
0.3%
7 3
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 5375
> 99.9%
[ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
741074
100.0%
Close Punctuation
ValueCountFrequency (%)
) 5447
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2452
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5152966
79.2%
Common 1355561
 
20.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 563544
 
10.9%
n 389678
 
7.6%
a 341353
 
6.6%
o 335107
 
6.5%
r 327053
 
6.3%
l 295446
 
5.7%
i 245022
 
4.8%
s 228632
 
4.4%
t 223935
 
4.3%
h 116266
 
2.3%
Other values (44) 2086930
40.5%
Common
ValueCountFrequency (%)
741074
54.7%
. 539103
39.8%
& 50656
 
3.7%
, 8029
 
0.6%
) 5447
 
0.4%
( 5375
 
0.4%
- 2452
 
0.2%
1 1561
 
0.1%
' 1002
 
0.1%
8 243
 
< 0.1%
Other values (16) 619
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6508521
> 99.9%
None 6
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
741074
 
11.4%
e 563544
 
8.7%
. 539103
 
8.3%
n 389678
 
6.0%
a 341353
 
5.2%
o 335107
 
5.1%
r 327053
 
5.0%
l 295446
 
4.5%
i 245022
 
3.8%
s 228632
 
3.5%
Other values (68) 2502509
38.4%
None
ValueCountFrequency (%)
ç 3
50.0%
ā 3
50.0%
Distinct: 21
Distinct (%): < 0.1%
Missing: 44
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 1
Mean length: 1.000033255
Min length: 1

Characters and Unicode

Total characters: 601427
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 11
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 1
ValueCountFrequency (%)
1 601314
> 99.9%
2 45
 
< 0.1%
6 8
 
< 0.1%
3 8
 
< 0.1%
4 6
 
< 0.1%
7 5
 
< 0.1%
5 4
 
< 0.1%
271 2
 
< 0.1%
11 2
 
< 0.1%
20 2
 
< 0.1%
Other values (11) 11
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 601427
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Common 601427
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 601427
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 601326
> 99.9%
2 51
 
< 0.1%
6 9
 
< 0.1%
3 9
 
< 0.1%
4 9
 
< 0.1%
7 8
 
< 0.1%
0 7
 
< 0.1%
5 6
 
< 0.1%
9 1
 
< 0.1%
8 1
 
< 0.1%

sex
Text

Distinct: 23
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 21
Mean length: 5.271076114
Min length: 1

Characters and Unicode

Total characters: 3170294
Distinct characters: 27
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 9
Unique (%): < 0.1%

Sample

1st row: Male
2nd row: Male
3rd row: Male
4th row: Female
5th row: Female
ValueCountFrequency (%)
male 266476
44.2%
female 246781
41.0%
unknown 87925
 
14.6%
multiple 279
 
< 0.1%
animals 279
 
< 0.1%
of 279
 
< 0.1%
mixed 279
 
< 0.1%
sex 279
 
< 0.1%
12
 
< 0.1%
f 5
 
< 0.1%
Other values (6) 10
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 760879
24.0%
l 514098
16.2%
a 513820
16.2%
M 266760
 
8.4%
n 264058
 
8.3%
m 247341
 
7.8%
F 246786
 
7.8%
o 88205
 
2.8%
w 87927
 
2.8%
U 87926
 
2.8%
Other values (17) 92494
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2567611
81.0%
Uppercase Letter 601473
 
19.0%
Space Separator 1153
 
< 0.1%
Other Punctuation 57
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 760879
29.6%
l 514098
20.0%
a 513820
20.0%
n 264058
 
10.3%
m 247341
 
9.6%
o 88205
 
3.4%
w 87927
 
3.4%
k 87926
 
3.4%
i 839
 
< 0.1%
s 558
 
< 0.1%
Other values (9) 1960
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
M 266760
44.4%
F 246786
41.0%
U 87926
 
14.6%
P 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 41
71.9%
? 15
 
26.3%
/ 1
 
1.8%
Space Separator
ValueCountFrequency (%)
1153
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3169084
> 99.9%
Common 1210
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 760879
24.0%
l 514098
16.2%
a 513820
16.2%
M 266760
 
8.4%
n 264058
 
8.3%
m 247341
 
7.8%
F 246786
 
7.8%
o 88205
 
2.8%
w 87927
 
2.8%
U 87926
 
2.8%
Other values (13) 91284
 
2.9%
Common
ValueCountFrequency (%)
1153
95.3%
; 41
 
3.4%
? 15
 
1.2%
/ 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3170294
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 760879
24.0%
l 514098
16.2%
a 513820
16.2%
M 266760
 
8.4%
n 264058
 
8.3%
m 247341
 
7.8%
F 246786
 
7.8%
o 88205
 
2.8%
w 87927
 
2.8%
U 87926
 
2.8%
Other values (17) 92494
 
2.9%

lifeStage
Text

Missing 

Distinct: 91
Distinct (%): 0.2%
Missing: 549447
Missing (%): 91.4%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 5
Mean length: 6.100703792
Min length: 3

Characters and Unicode

Total characters: 317261
Distinct characters: 45
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26
Unique (%): < 0.1%

Sample

1st row: Adult
2nd row: Adult
3rd row: Juvenile
4th row: Juvenile
5th row: Adult
ValueCountFrequency (%)
adult 31151
58.8%
juvenile 9861
 
18.6%
immature 3907
 
7.4%
subadult 2173
 
4.1%
young 1853
 
3.5%
embryo 837
 
1.6%
fetus 684
 
1.3%
old 511
 
1.0%
nestling 499
 
0.9%
neonate 453
 
0.9%
Other values (55) 1019
 
1.9%

Most occurring characters

ValueCountFrequency (%)
u 52076
16.4%
l 44573
14.0%
t 39354
12.4%
d 33936
10.7%
A 30440
9.6%
e 26241
8.3%
n 13312
 
4.2%
i 10584
 
3.3%
v 9888
 
3.1%
J 9846
 
3.1%
Other values (35) 47011
14.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 264129
83.3%
Uppercase Letter 51998
 
16.4%
Space Separator 944
 
0.3%
Other Punctuation 173
 
0.1%
Dash Punctuation 17
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 52076
19.7%
l 44573
16.9%
t 39354
14.9%
d 33936
12.8%
e 26241
9.9%
n 13312
 
5.0%
i 10584
 
4.0%
v 9888
 
3.7%
m 8822
 
3.3%
a 7788
 
2.9%
Other values (13) 17555
 
6.6%
Uppercase Letter
ValueCountFrequency (%)
A 30440
58.5%
J 9846
 
18.9%
I 3944
 
7.6%
S 2221
 
4.3%
Y 1917
 
3.7%
E 998
 
1.9%
N 995
 
1.9%
F 707
 
1.4%
O 511
 
1.0%
P 154
 
0.3%
Other values (7) 265
 
0.5%
Other Punctuation
ValueCountFrequency (%)
? 108
62.4%
/ 46
26.6%
; 19
 
11.0%
Space Separator
ValueCountFrequency (%)
944
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 17
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 316127
99.6%
Common 1134
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 52076
16.5%
l 44573
14.1%
t 39354
12.4%
d 33936
10.7%
A 30440
9.6%
e 26241
8.3%
n 13312
 
4.2%
i 10584
 
3.3%
v 9888
 
3.1%
J 9846
 
3.1%
Other values (30) 45877
14.5%
Common
ValueCountFrequency (%)
944
83.2%
? 108
 
9.5%
/ 46
 
4.1%
; 19
 
1.7%
- 17
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 317261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
u 52076
16.4%
l 44573
14.0%
t 39354
12.4%
d 33936
10.7%
A 30440
9.6%
e 26241
8.3%
n 13312
 
4.2%
i 10584
 
3.3%
v 9888
 
3.1%
J 9846
 
3.1%
Other values (35) 47011
14.8%

preparations
Text

Missing 

Distinct: 542
Distinct (%): 0.1%
Missing: 26965
Missing (%): 4.5%
Memory size: 4.6 MiB

Length

Max length: 73
Median length: 11
Mean length: 10.02423558
Min length: 4

Characters and Unicode

Total characters: 5758783
Distinct characters: 49
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 248
Unique (%): < 0.1%

Sample

1st row: Skin; Skull
2nd row: Skin; Skull
3rd row: Skin; Skull
4th row: Skin; Skull
5th row: Skin; Skull
ValueCountFrequency (%)
skull 452764
44.7%
skin 367609
36.3%
fluid 101452
 
10.0%
skeleton 36584
 
3.6%
partial 10316
 
1.0%
in 8642
 
0.9%
remainder 8641
 
0.9%
anatomical 5878
 
0.6%
baculum/baubellum 3372
 
0.3%
baleen 2349
 
0.2%
Other values (42) 14726
 
1.5%

Most occurring characters

ValueCountFrequency (%)
l 1076304
18.7%
k 859539
14.9%
S 856659
14.9%
u 570461
9.9%
i 506031
8.8%
437847
7.6%
n 435543
7.6%
; 404417
 
7.0%
d 111124
 
1.9%
e 103346
 
1.8%
Other values (39) 397512
 
6.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3909067
67.9%
Uppercase Letter 1004072
 
17.4%
Space Separator 437847
 
7.6%
Other Punctuation 407794
 
7.1%
Decimal Number 2
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 1076304
27.5%
k 859539
22.0%
u 570461
14.6%
i 506031
12.9%
n 435543
11.1%
d 111124
 
2.8%
e 103346
 
2.6%
t 60548
 
1.5%
o 55332
 
1.4%
a 53911
 
1.4%
Other values (15) 76928
 
2.0%
Uppercase Letter
ValueCountFrequency (%)
S 856659
85.3%
F 101451
 
10.1%
P 11688
 
1.2%
B 9093
 
0.9%
R 8650
 
0.9%
A 6797
 
0.7%
T 3295
 
0.3%
H 2684
 
0.3%
O 1310
 
0.1%
M 940
 
0.1%
Other values (6) 1505
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 404417
99.2%
/ 3372
 
0.8%
, 4
 
< 0.1%
. 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
5 1
50.0%
6 1
50.0%
Space Separator
ValueCountFrequency (%)
437847
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4913139
85.3%
Common 845644
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 1076304
21.9%
k 859539
17.5%
S 856659
17.4%
u 570461
11.6%
i 506031
10.3%
n 435543
8.9%
d 111124
 
2.3%
e 103346
 
2.1%
F 101451
 
2.1%
t 60548
 
1.2%
Other values (31) 232133
 
4.7%
Common
ValueCountFrequency (%)
437847
51.8%
; 404417
47.8%
/ 3372
 
0.4%
, 4
 
< 0.1%
5 1
 
< 0.1%
. 1
 
< 0.1%
6 1
 
< 0.1%
+ 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5758783
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 1076304
18.7%
k 859539
14.9%
S 856659
14.9%
u 570461
9.9%
i 506031
8.8%
437847
7.6%
n 435543
7.6%
; 404417
 
7.0%
d 111124
 
1.9%
e 103346
 
1.8%
Other values (39) 397512
 
6.9%

associatedMedia
Text

Missing 

Distinct: 48707
Distinct (%): 8.8%
Missing: 45503
Missing (%): 7.6%
Memory size: 4.6 MiB

Length

Max length: 1263
Median length: 49
Mean length: 50.56994719
Min length: 48

Characters and Unicode

Total characters: 28114261
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 15254
Unique (%): 2.7%

Sample

1st row: https://collections.nmnh.si.edu/media/?i=14431681
2nd row: https://collections.nmnh.si.edu/media/?i=14603706
3rd row: https://collections.nmnh.si.edu/media/?i=14483098
4th row: https://collections.nmnh.si.edu/media/?i=14780717
5th row: https://collections.nmnh.si.edu/media/?i=14572646
ValueCountFrequency (%)
14887746 84
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14563406 60
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561922 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561911 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561967 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561909 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561974 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561968 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561972 50
 
< 0.1%
https://collections.nmnh.si.edu/media/?i=14561943 50
 
< 0.1%
Other values (81691) 643161
99.9%

Most occurring characters

ValueCountFrequency (%)
i 2223792
 
7.9%
/ 2223792
 
7.9%
t 1667844
 
5.9%
s 1667844
 
5.9%
. 1667844
 
5.9%
n 1667844
 
5.9%
e 1667844
 
5.9%
h 1111896
 
4.0%
d 1111896
 
4.0%
m 1111896
 
4.0%
Other values (21) 11991769
42.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17234388
61.3%
Decimal Number 5144879
 
18.3%
Other Punctuation 5091289
 
18.1%
Math Symbol 555948
 
2.0%
Space Separator 87757
 
0.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2223792
12.9%
t 1667844
9.7%
s 1667844
9.7%
n 1667844
9.7%
e 1667844
9.7%
h 1111896
 
6.5%
d 1111896
 
6.5%
m 1111896
 
6.5%
l 1111896
 
6.5%
o 1111896
 
6.5%
Other values (4) 2779740
16.1%
Decimal Number
ValueCountFrequency (%)
1 1029241
20.0%
4 925235
18.0%
6 489208
9.5%
7 463799
9.0%
5 412817
8.0%
0 384669
 
7.5%
8 371873
 
7.2%
3 367486
 
7.1%
9 358952
 
7.0%
2 341599
 
6.6%
Other Punctuation
ValueCountFrequency (%)
/ 2223792
43.7%
. 1667844
32.8%
? 555948
 
10.9%
: 555948
 
10.9%
; 87757
 
1.7%
Math Symbol
ValueCountFrequency (%)
= 555948
100.0%
Space Separator
ValueCountFrequency (%)
87757
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17234388
61.3%
Common 10879873
38.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 2223792
20.4%
. 1667844
15.3%
1 1029241
9.5%
4 925235
8.5%
? 555948
 
5.1%
= 555948
 
5.1%
: 555948
 
5.1%
6 489208
 
4.5%
7 463799
 
4.3%
5 412817
 
3.8%
Other values (7) 2000093
18.4%
Latin
ValueCountFrequency (%)
i 2223792
12.9%
t 1667844
9.7%
s 1667844
9.7%
n 1667844
9.7%
e 1667844
9.7%
h 1111896
 
6.5%
d 1111896
 
6.5%
m 1111896
 
6.5%
l 1111896
 
6.5%
o 1111896
 
6.5%
Other values (4) 2779740
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 28114261
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2223792
 
7.9%
/ 2223792
 
7.9%
t 1667844
 
5.9%
s 1667844
 
5.9%
. 1667844
 
5.9%
n 1667844
 
5.9%
e 1667844
 
5.9%
h 1111896
 
4.0%
d 1111896
 
4.0%
m 1111896
 
4.0%
Other values (21) 11991769
42.7%

associatedSequences
Text

Missing 

Distinct: 1050
Distinct (%): 99.6%
Missing: 600397
Missing (%): 99.8%
Memory size: 4.6 MiB

Length

Max length: 699
Median length: 49
Mean length: 99.59108159
Min length: 47

Characters and Unicode

Total characters: 104969
Distinct characters: 58
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1046
Unique (%): 99.2%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=AY922964|https://www.ncbi.nlm.nih.gov/gquery?term=AY922875
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=KC753815|https://www.ncbi.nlm.nih.gov/gquery?term=KC753933|https://www.ncbi.nlm.nih.gov/gquery?term=KC754042|https://www.ncbi.nlm.nih.gov/gquery?term=KC754162|https://www.ncbi.nlm.nih.gov/gquery?term=KC754280
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=KC011508|https://www.ncbi.nlm.nih.gov/gquery?term=KC011594|https://www.ncbi.nlm.nih.gov/gquery?term=KC011682
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=MN707485|https://www.ncbi.nlm.nih.gov/gquery?term=MN707432
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=JQ317640|https://www.ncbi.nlm.nih.gov/gquery?term=JQ317668
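
The sample rows above suggest that each associatedSequences cell holds one or more lookup URLs joined by `|`, each carrying an identifier in its `term` query parameter. A minimal sketch for pulling those identifiers out is given below. The function name `accession_ids` is hypothetical, and treating `|` as the only delimiter is an assumption based on the five samples rather than on the full column.

```python
from urllib.parse import urlparse, parse_qs


def accession_ids(cell):
    """Split a pipe-delimited cell and return the `term` value of each URL."""
    ids = []
    for url in cell.split("|"):
        # parse_qs maps each query key to a list of values.
        query = parse_qs(urlparse(url.strip()).query)
        ids.extend(query.get("term", []))
    return ids


# First sample row from this section.
cell = (
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY922964|"
    "https://www.ncbi.nlm.nih.gov/gquery?term=AY922875"
)
ids = accession_ids(cell)
```

Working on extracted identifiers rather than whole URL strings may give a more meaningful distinct count than the one computed on the raw text.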
ValueCountFrequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=eu021073 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=fj383131 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=kx998919 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=eu021074 2
 
0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=dq178333|https://www.ncbi.nlm.nih.gov/gquery?term=dq178344 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ay974630|https://www.ncbi.nlm.nih.gov/gquery?term=ay974676 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc753815|https://www.ncbi.nlm.nih.gov/gquery?term=kc753933|https://www.ncbi.nlm.nih.gov/gquery?term=kc754042|https://www.ncbi.nlm.nih.gov/gquery?term=kc754162|https://www.ncbi.nlm.nih.gov/gquery?term=kc754280 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kc011508|https://www.ncbi.nlm.nih.gov/gquery?term=kc011594|https://www.ncbi.nlm.nih.gov/gquery?term=kc011682 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mn707485|https://www.ncbi.nlm.nih.gov/gquery?term=mn707432 1
 
0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq317640|https://www.ncbi.nlm.nih.gov/gquery?term=jq317668 1
 
0.1%
Other values (1040) 1040
98.7%

Most occurring characters

ValueCountFrequency (%)
. 8515
 
8.1%
/ 6360
 
6.1%
w 6360
 
6.1%
n 6360
 
6.1%
t 6360
 
6.1%
h 4240
 
4.0%
r 4240
 
4.0%
e 4240
 
4.0%
i 4240
 
4.0%
m 4240
 
4.0%
Other values (48) 49814
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 65720
62.6%
Other Punctuation 19115
 
18.2%
Decimal Number 12730
 
12.1%
Uppercase Letter 4213
 
4.0%
Math Symbol 3186
 
3.0%
Connector Punctuation 5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 814
19.3%
M 721
17.1%
N 422
10.0%
Y 404
9.6%
A 392
9.3%
T 258
 
6.1%
F 237
 
5.6%
J 212
 
5.0%
C 171
 
4.1%
Q 146
 
3.5%
Other values (12) 436
10.3%
Lowercase Letter
ValueCountFrequency (%)
w 6360
 
9.7%
n 6360
 
9.7%
t 6360
 
9.7%
h 4240
 
6.5%
r 4240
 
6.5%
e 4240
 
6.5%
i 4240
 
6.5%
m 4240
 
6.5%
g 4240
 
6.5%
v 2120
 
3.2%
Other values (9) 19080
29.0%
Decimal Number
ValueCountFrequency (%)
7 1517
11.9%
3 1452
11.4%
6 1407
11.1%
9 1389
10.9%
2 1352
10.6%
4 1216
9.6%
8 1213
9.5%
1 1128
8.9%
5 1094
8.6%
0 962
7.6%
Other Punctuation
ValueCountFrequency (%)
. 8515
44.5%
/ 6360
33.3%
? 2120
 
11.1%
: 2120
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 2120
66.5%
| 1066
33.5%
Connector Punctuation
ValueCountFrequency (%)
_ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 69933
66.6%
Common 35036
33.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
w 6360
 
9.1%
n 6360
 
9.1%
t 6360
 
9.1%
h 4240
 
6.1%
r 4240
 
6.1%
e 4240
 
6.1%
i 4240
 
6.1%
m 4240
 
6.1%
g 4240
 
6.1%
v 2120
 
3.0%
Other values (31) 23293
33.3%
Common
ValueCountFrequency (%)
. 8515
24.3%
/ 6360
18.2%
? 2120
 
6.1%
: 2120
 
6.1%
= 2120
 
6.1%
7 1517
 
4.3%
3 1452
 
4.1%
6 1407
 
4.0%
9 1389
 
4.0%
2 1352
 
3.9%
Other values (7) 6684
19.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 104969
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 8515
 
8.1%
/ 6360
 
6.1%
w 6360
 
6.1%
n 6360
 
6.1%
t 6360
 
6.1%
h 4240
 
4.0%
r 4240
 
4.0%
e 4240
 
4.0%
i 4240
 
4.0%
m 4240
 
4.0%
Other values (48) 49814
47.5%

occurrenceRemarks
Text

Missing 

Distinct: 5322
Distinct (%): 49.3%
Missing: 590662
Missing (%): 98.2%
Memory size: 4.6 MiB

Length

Max length: 44804
Median length: 2082
Mean length: 214.0076003
Min length: 4

Characters and Unicode

Total characters: 2308928
Distinct characters: 158
Distinct categories: 18
Distinct scripts: 3
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4721
Unique (%): 43.8%

Sample

1st row: From ledger catalogue 577876-577900: "field data recorded from field catalogues"
2nd row: Skin found in rotunda hallway hold-up case, 2017. May need tanning before installation into collection.
3rd row: Lectotype designated by Avila Pires (1968:163).
4th row: Skull removed from alcoholic specimen.
5th row: More than 800 dolphins stranded along a 220 km stretch pof the coast of Peru. See STR18239.; Broccetto, Marilia CNN website 22 IV 2012
ValueCountFrequency (%)
the 13880
 
3.8%
of 9359
 
2.6%
and 7684
 
2.1%
in 7077
 
1.9%
for 6435
 
1.8%
to 6041
 
1.6%
4896
 
1.3%
on 4761
 
1.3%
was 4231
 
1.2%
from 3875
 
1.1%
Other values (19019) 298259
81.4%

Most occurring characters

ValueCountFrequency (%)
355709
15.4%
e 205843
 
8.9%
a 147185
 
6.4%
t 125245
 
5.4%
o 122482
 
5.3%
n 120296
 
5.2%
i 111994
 
4.9%
s 111800
 
4.8%
r 110930
 
4.8%
l 77896
 
3.4%
Other values (148) 819548
35.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1587531
68.8%
Space Separator 355709
 
15.4%
Uppercase Letter 132353
 
5.7%
Decimal Number 122350
 
5.3%
Other Punctuation 87540
 
3.8%
Dash Punctuation 8132
 
0.4%
Close Punctuation 6920
 
0.3%
Open Punctuation 6894
 
0.3%
Math Symbol 680
 
< 0.1%
Connector Punctuation 461
 
< 0.1%
Other values (8) 358
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 205843
13.0%
a 147185
 
9.3%
t 125245
 
7.9%
o 122482
 
7.7%
n 120296
 
7.6%
i 111994
 
7.1%
s 111800
 
7.0%
r 110930
 
7.0%
l 77896
 
4.9%
d 65194
 
4.1%
Other values (53) 388666
24.5%
Uppercase Letter
ValueCountFrequency (%)
S 13793
 
10.4%
M 11265
 
8.5%
N 10762
 
8.1%
T 10560
 
8.0%
C 8190
 
6.2%
F 7728
 
5.8%
I 7523
 
5.7%
A 7439
 
5.6%
B 6332
 
4.8%
R 5318
 
4.0%
Other values (18) 43443
32.8%
Other Punctuation
ValueCountFrequency (%)
. 36734
42.0%
, 26137
29.9%
: 6493
 
7.4%
" 5631
 
6.4%
; 4846
 
5.5%
/ 3229
 
3.7%
' 1865
 
2.1%
# 977
 
1.1%
& 535
 
0.6%
? 299
 
0.3%
Other values (12) 794
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 20642
16.9%
0 20306
16.6%
2 20036
16.4%
5 10174
8.3%
9 10101
8.3%
7 9447
7.7%
6 8256
 
6.7%
3 8246
 
6.7%
4 7859
 
6.4%
8 7283
 
6.0%
Math Symbol
ValueCountFrequency (%)
= 207
30.4%
+ 203
29.9%
~ 120
17.6%
< 79
 
11.6%
> 62
 
9.1%
| 4
 
0.6%
± 2
 
0.3%
¬ 2
 
0.3%
1
 
0.1%
Other Number
ValueCountFrequency (%)
½ 29
65.9%
¼ 7
 
15.9%
¹ 5
 
11.4%
¾ 3
 
6.8%
Dash Punctuation
ValueCountFrequency (%)
- 7459
91.7%
656
 
8.1%
17
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 6315
91.3%
] 602
 
8.7%
} 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 6291
91.3%
[ 600
 
8.7%
{ 3
 
< 0.1%
Final Punctuation
ValueCountFrequency (%)
90
98.9%
» 1
 
1.1%
Currency Symbol
ValueCountFrequency (%)
$ 48
82.8%
¥ 10
 
17.2%
Format
ValueCountFrequency (%)
3
60.0%
 2
40.0%
Modifier Symbol
ValueCountFrequency (%)
´ 1
50.0%
^ 1
50.0%
Space Separator
ValueCountFrequency (%)
355709
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 461
100.0%
Initial Punctuation
ValueCountFrequency (%)
83
100.0%
Other Symbol
ValueCountFrequency (%)
° 67
100.0%
Other Letter
ValueCountFrequency (%)
º 8
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1719816
74.5%
Common 589036
 
25.5%
Greek 76
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 205843
 
12.0%
a 147185
 
8.6%
t 125245
 
7.3%
o 122482
 
7.1%
n 120296
 
7.0%
i 111994
 
6.5%
s 111800
 
6.5%
r 110930
 
6.5%
l 77896
 
4.5%
d 65194
 
3.8%
Other values (70) 520951
30.3%
Common
ValueCountFrequency (%)
355709
60.4%
. 36734
 
6.2%
, 26137
 
4.4%
1 20642
 
3.5%
0 20306
 
3.4%
2 20036
 
3.4%
5 10174
 
1.7%
9 10101
 
1.7%
7 9447
 
1.6%
6 8256
 
1.4%
Other values (56) 71494
 
12.1%
Greek
ValueCountFrequency (%)
μ 64
84.2%
ο 2
 
2.6%
ή 1
 
1.3%
ϊ 1
 
1.3%
ι 1
 
1.3%
ν 1
 
1.3%
ρ 1
 
1.3%
υ 1
 
1.3%
δ 1
 
1.3%
α 1
 
1.3%
Other values (2) 2
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2307432
99.9%
Punctuation 858
 
< 0.1%
None 637
 
< 0.1%
Math Operators 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
355709
15.4%
e 205843
 
8.9%
a 147185
 
6.4%
t 125245
 
5.4%
o 122482
 
5.3%
n 120296
 
5.2%
i 111994
 
4.9%
s 111800
 
4.8%
r 110930
 
4.8%
l 77896
 
3.4%
Other values (84) 818052
35.5%
Punctuation
ValueCountFrequency (%)
656
76.5%
90
 
10.5%
83
 
9.7%
17
 
2.0%
4
 
0.5%
3
 
0.3%
2
 
0.2%
2
 
0.2%
1
 
0.1%
None
ValueCountFrequency (%)
· 170
26.7%
é 78
12.2%
° 67
 
10.5%
μ 64
 
10.0%
ì 58
 
9.1%
½ 29
 
4.6%
è 20
 
3.1%
Ö 12
 
1.9%
ä 10
 
1.6%
ü 10
 
1.6%
Other values (44) 119
18.7%
Math Operators
ValueCountFrequency (%)
1
100.0%

eventDate
Text

Missing 

Distinct: 46637
Distinct (%): 8.1%
Missing: 28127
Missing (%): 4.7%
Memory size: 4.6 MiB

Length

Max length: 21
Median length: 10
Mean length: 9.727609519
Min length: 4

Characters and Unicode

Total characters: 5577072
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7673
Unique (%): 1.3%

Sample

1st row: 1989-02-28
2nd row: 1917-08-08
3rd row: 1966-05
4th row: 1894-07-15
5th row: 1992-11-05
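
The samples and length statistics hint that eventDate mixes several precisions: a bare year, year and month, a full date, and possibly two such values joined by `/`. The sketch below classifies a value by precision. The function name `date_precision`, the label strings, and the reading of `/` as a range separator are all assumptions made for illustration. They are not confirmed by the report.

```python
import re

# Assumed shapes: YYYY, YYYY-MM, or YYYY-MM-DD.
_PART = re.compile(r"^(\d{4})(?:-(\d{2}))?(?:-(\d{2}))?$")


def date_precision(event_date):
    """Return 'year', 'month', 'day', 'range', or 'unparsed' for an eventDate string."""
    parts = event_date.split("/")
    if len(parts) == 2 and all(_PART.match(p) for p in parts):
        return "range"
    if len(parts) != 1:
        return "unparsed"
    match = _PART.match(event_date)
    if not match:
        return "unparsed"
    if match.group(3):
        return "day"
    if match.group(2):
        return "month"
    return "year"
```

For example, `date_precision("1966-05")` returns `"month"`. Grouping the column by this label would show how many records carry only partial dates, which the text-typed profile does not report directly.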
ValueCountFrequency (%)
1968 1160
 
0.2%
1959 769
 
0.1%
1965-06 704
 
0.1%
1966-06-02 682
 
0.1%
1903 600
 
0.1%
1905 591
 
0.1%
1965 543
 
0.1%
1967-08 537
 
0.1%
1967-05 529
 
0.1%
1968-09-02 520
 
0.1%
Other values (46627) 566689
98.8%

Most occurring characters

ValueCountFrequency (%)
1 1092495
19.6%
- 1092362
19.6%
0 833561
14.9%
9 717766
12.9%
2 392002
 
7.0%
6 323478
 
5.8%
8 309247
 
5.5%
7 251593
 
4.5%
3 195678
 
3.5%
5 191966
 
3.4%
Other values (2) 176924
 
3.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4483372
80.4%
Dash Punctuation 1092362
 
19.6%
Other Punctuation 1338
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1092495
24.4%
0 833561
18.6%
9 717766
16.0%
2 392002
 
8.7%
6 323478
 
7.2%
8 309247
 
6.9%
7 251593
 
5.6%
3 195678
 
4.4%
5 191966
 
4.3%
4 175586
 
3.9%
Dash Punctuation
ValueCountFrequency (%)
- 1092362
100.0%
Other Punctuation
ValueCountFrequency (%)
/ 1338
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 5577072
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1092495
19.6%
- 1092362
19.6%
0 833561
14.9%
9 717766
12.9%
2 392002
 
7.0%
6 323478
 
5.8%
8 309247
 
5.5%
7 251593
 
4.5%
3 195678
 
3.5%
5 191966
 
3.4%
Other values (2) 176924
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5577072
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1092495
19.6%
- 1092362
19.6%
0 833561
14.9%
9 717766
12.9%
2 392002
 
7.0%
6 323478
 
5.8%
8 309247
 
5.5%
7 251593
 
4.5%
3 195678
 
3.5%
5 191966
 
3.4%
Other values (2) 176924
 
3.2%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 46793
Missing (%): 7.8%
Memory size: 4.6 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.724276942
Min length: 1

Characters and Unicode

Total characters: 1511042
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 59
2nd row: 220
3rd row: 151
4th row: 196
5th row: 310
ValueCountFrequency (%)
181 3910
 
0.7%
59 3214
 
0.6%
243 3136
 
0.6%
212 3000
 
0.5%
151 2957
 
0.5%
213 2690
 
0.5%
120 2635
 
0.5%
334 2476
 
0.4%
193 2428
 
0.4%
304 2382
 
0.4%
Other values (356) 525830
94.8%

Most occurring characters

ValueCountFrequency (%)
1 288485
19.1%
2 278737
18.4%
3 192107
12.7%
5 114772
 
7.6%
4 114286
 
7.6%
6 108886
 
7.2%
0 104494
 
6.9%
7 103973
 
6.9%
9 103679
 
6.9%
8 101623
 
6.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1511042
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 288485
19.1%
2 278737
18.4%
3 192107
12.7%
5 114772
 
7.6%
4 114286
 
7.6%
6 108886
 
7.2%
0 104494
 
6.9%
7 103973
 
6.9%
9 103679
 
6.9%
8 101623
 
6.7%

Most occurring scripts

ValueCountFrequency (%)
Common 1511042
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 288485
19.1%
2 278737
18.4%
3 192107
12.7%
5 114772
 
7.6%
4 114286
 
7.6%
6 108886
 
7.2%
0 104494
 
6.9%
7 103973
 
6.9%
9 103679
 
6.9%
8 101623
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1511042
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 288485
19.1%
2 278737
18.4%
3 192107
12.7%
5 114772
 
7.6%
4 114286
 
7.6%
6 108886
 
7.2%
0 104494
 
6.9%
7 103973
 
6.9%
9 103679
 
6.9%
8 101623
 
6.7%

endDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): 0.1%
Missing: 46765
Missing (%): 7.8%
Memory size: 4.6 MiB
2025-01-14T11:48:56.240133image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length3
Median length3
Mean length2.724321508
Min length1

Characters and Unicode

Total characters1511143
Distinct characters10
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row59
2nd row220
3rd row151
4th row196
5th row310
ValueCountFrequency (%)
181 3912
 
0.7%
59 3215
 
0.6%
243 3146
 
0.6%
151 3016
 
0.5%
212 2960
 
0.5%
213 2646
 
0.5%
120 2638
 
0.5%
334 2464
 
0.4%
304 2406
 
0.4%
222 2369
 
0.4%
Other values (356) 525914
94.8%
2025-01-14T11:48:56.509573image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Most occurring characters

Value  Count  Frequency (%)
1  288287  19.1%
2  278817  18.5%
3  192047  12.7%
5  114832  7.6%
4  114656  7.6%
6  108777  7.2%
0  104587  6.9%
7  103968  6.9%
9  103568  6.9%
8  101604  6.7%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  1511143  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  288287  19.1%
2  278817  18.5%
3  192047  12.7%
5  114832  7.6%
4  114656  7.6%
6  108777  7.2%
0  104587  6.9%
7  103968  6.9%
9  103568  6.9%
8  101604  6.7%

Most occurring scripts

Value  Count  Frequency (%)
Common  1511143  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  288287  19.1%
2  278817  18.5%
3  192047  12.7%
5  114832  7.6%
4  114656  7.6%
6  108777  7.2%
0  104587  6.9%
7  103968  6.9%
9  103568  6.9%
8  101604  6.7%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  1511143  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  288287  19.1%
2  278817  18.5%
3  192047  12.7%
5  114832  7.6%
4  114656  7.6%
6  108777  7.2%
0  104587  6.9%
7  103968  6.9%
9  103568  6.9%
8  101604  6.7%
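
The category tables rest on the Unicode general category of each character. How the report computes them is not shown here; as a rough approximation, the Python standard library's `unicodedata.category` returns the two-letter category code (for example `Nd` for Decimal Number). The helper name `category_counts` is hypothetical, and the standard library does not expose script or block, so only the category tally is sketched.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over every character in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# The five sample rows of endDayOfYear contain only decimal digits.
digits_only = category_counts(["59", "220", "151", "196", "310"])
print(digits_only)  # Counter({'Nd': 14})

# A verbatimEventDate sample mixes digits, spaces and letters.
mixed = category_counts(["28 Feb 1989"])
print(mixed)
```

For a digit-only column such as this one, every character falls into `Nd`, which is why the report lists a single category at 100.0%.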

year
Text

Missing 

Distinct: 350
Distinct (%): 0.1%
Missing: 28127
Missing (%): 4.7%
Memory size: 4.6 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 2293296
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 74
Unique (%): < 0.1%

Sample

1st row: 1989
2nd row: 1917
3rd row: 1966
4th row: 1894
5th row: 1992

Value  Count  Frequency (%)
1967  30814  5.4%
1968  27037  4.7%
1966  22575  3.9%
1969  15259  2.7%
1965  12690  2.2%
1964  12541  2.2%
1962  11211  2.0%
1970  10527  1.8%
1916  9955  1.7%
1963  9798  1.7%
Other values (340)  410917  71.7%

Most occurring characters

Value  Count  Frequency (%)
1  670022  29.2%
9  621720  27.1%
6  214950  9.4%
8  199846  8.7%
7  134632  5.9%
0  133037  5.8%
5  87362  3.8%
2  86888  3.8%
4  76111  3.3%
3  68728  3.0%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  2293296  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  670022  29.2%
9  621720  27.1%
6  214950  9.4%
8  199846  8.7%
7  134632  5.9%
0  133037  5.8%
5  87362  3.8%
2  86888  3.8%
4  76111  3.3%
3  68728  3.0%

Most occurring scripts

Value  Count  Frequency (%)
Common  2293296  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  670022  29.2%
9  621720  27.1%
6  214950  9.4%
8  199846  8.7%
7  134632  5.9%
0  133037  5.8%
5  87362  3.8%
2  86888  3.8%
4  76111  3.3%
3  68728  3.0%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  2293296  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  670022  29.2%
9  621720  27.1%
6  214950  9.4%
8  199846  8.7%
7  134632  5.9%
0  133037  5.8%
5  87362  3.8%
2  86888  3.8%
4  76111  3.3%
3  68728  3.0%
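
The year figures can be cross-checked against each other with a little arithmetic. The sketch below uses only numbers printed in this report (the observation count from the overview, plus the missing count and mean length for year) and assumes that length statistics are computed over non-missing values only.

```python
observations = 601451   # "Number of observations" from the report overview
missing = 28127         # year: Missing
mean_length = 4         # year: Mean length (every value has four digits)

non_missing = observations - missing
total_characters = non_missing * mean_length
missing_pct = round(100 * missing / observations, 1)

print(non_missing, total_characters, missing_pct)  # 573324 2293296 4.7
```

The product reproduces the reported total of 2293296 characters, and the rounded ratio reproduces the reported 4.7% missing.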

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 44866
Missing (%): 7.5%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.192750433
Min length: 1

Characters and Unicode

Total characters: 663867
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2
2nd row: 8
3rd row: 5
4th row: 7
5th row: 11

Value  Count  Frequency (%)
7  63622  11.4%
8  55632  10.0%
6  55508  10.0%
3  50988  9.2%
5  50119  9.0%
4  46824  8.4%
9  43994  7.9%
2  43078  7.7%
10  40461  7.3%
1  39538  7.1%
Other values (2)  66821  12.0%

Most occurring characters

Value  Count  Frequency (%)
1  182088  27.4%
2  74631  11.2%
7  63622  9.6%
8  55632  8.4%
6  55508  8.4%
3  50988  7.7%
5  50119  7.5%
4  46824  7.1%
9  43994  6.6%
0  40461  6.1%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  663867  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  182088  27.4%
2  74631  11.2%
7  63622  9.6%
8  55632  8.4%
6  55508  8.4%
3  50988  7.7%
5  50119  7.5%
4  46824  7.1%
9  43994  6.6%
0  40461  6.1%

Most occurring scripts

Value  Count  Frequency (%)
Common  663867  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  182088  27.4%
2  74631  11.2%
7  63622  9.6%
8  55632  8.4%
6  55508  8.4%
3  50988  7.7%
5  50119  7.5%
4  46824  7.1%
9  43994  6.6%
0  40461  6.1%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  663867  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  182088  27.4%
2  74631  11.2%
7  63622  9.6%
8  55632  8.4%
6  55508  8.4%
3  50988  7.7%
5  50119  7.5%
4  46824  7.1%
9  43994  6.6%
0  40461  6.1%
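
The month frequency table folds two values into "Other values (2)". Assuming the column holds the unpadded integers 1 to 12 (consistent with 12 distinct values, lengths of 1 or 2, and a count for the digit 0 equal to the count for month 10), those two values must be 11 and 12, and the character counts are enough to separate them. The split below is derived, not reported.

```python
# Counts copied from the month tables above.
value_counts = {"1": 39538, "2": 43078, "10": 40461}
other_two = 66821                      # "Other values (2)": months 11 and 12
char_counts = {"1": 182088, "2": 74631}

# The digit "2" occurs once in "2" and once in "12", and nowhere else.
december = char_counts["2"] - value_counts["2"]
november = other_two - december

# Cross-check with the digit "1": once each in "1", "10", "12"; twice in "11".
ones = value_counts["1"] + value_counts["10"] + december + 2 * november
print(november, december, ones == char_counts["1"])  # 35268 31553 True
```

The cross-check on the digit 1 matches the reported 182088, which supports the assumption about how the values are written.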

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 67482
Missing (%): 11.2%
Memory size: 4.6 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.708157215
Min length: 1

Characters and Unicode

Total characters: 912103
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 28
2nd row: 8
3rd row: 15
4th row: 5
5th row: 18

Value  Count  Frequency (%)
10  19188  3.6%
20  18614  3.5%
22  18464  3.5%
15  18400  3.4%
18  18199  3.4%
14  18001  3.4%
5  17933  3.4%
16  17919  3.4%
27  17835  3.3%
21  17818  3.3%
Other values (21)  351598  65.8%

Most occurring characters

Value  Count  Frequency (%)
1  238401  26.1%
2  229686  25.2%
3  75593  8.3%
5  53889  5.9%
0  53247  5.8%
8  53132  5.8%
7  52819  5.8%
6  52526  5.8%
4  52154  5.7%
9  50656  5.6%

Most occurring categories

Value  Count  Frequency (%)
Decimal Number  912103  100.0%

Most frequent character per category

Decimal Number
Value  Count  Frequency (%)
1  238401  26.1%
2  229686  25.2%
3  75593  8.3%
5  53889  5.9%
0  53247  5.8%
8  53132  5.8%
7  52819  5.8%
6  52526  5.8%
4  52154  5.7%
9  50656  5.6%

Most occurring scripts

Value  Count  Frequency (%)
Common  912103  100.0%

Most frequent character per script

Common
Value  Count  Frequency (%)
1  238401  26.1%
2  229686  25.2%
3  75593  8.3%
5  53889  5.9%
0  53247  5.8%
8  53132  5.8%
7  52819  5.8%
6  52526  5.8%
4  52154  5.7%
9  50656  5.6%

Most occurring blocks

Value  Count  Frequency (%)
ASCII  912103  100.0%

Most frequent character per block

ASCII
Value  Count  Frequency (%)
1  238401  26.1%
2  229686  25.2%
3  75593  8.3%
5  53889  5.9%
0  53247  5.8%
8  53132  5.8%
7  52819  5.8%
6  52526  5.8%
4  52154  5.7%
9  50656  5.6%

verbatimEventDate
Text

Missing 

Distinct: 45124
Distinct (%): 8.0%
Missing: 36490
Missing (%): 6.1%
Memory size: 4.6 MiB

Length

Max length: 82
Median length: 11
Mean length: 10.73425953
Min length: 3

Characters and Unicode

Total characters: 6064438
Distinct characters: 75
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 7925
Unique (%): 1.4%

Sample

1st row: 28 Feb 1989
2nd row: 8 Aug 1917
3rd row: -- May 1966
4th row: 15 Jul 1894
5th row: 5 Nov 1992

Value  Count  Frequency (%)
(blank)  119289  7.0%
jul  59029  3.5%
aug  52663  3.1%
jun  52253  3.1%
mar  49098  2.9%
may  47959  2.8%
apr  45015  2.6%
sep  41961  2.5%
feb  40432  2.4%
oct  39123  2.3%
Other values (873)  1153619  67.8%

Most occurring characters

ValueCountFrequency (%)
1135480
18.7%
1 869039
14.3%
9 644744
 
10.6%
2 290400
 
4.8%
- 284559
 
4.7%
6 256804
 
4.2%
8 242113
 
4.0%
7 176263
 
2.9%
u 165038
 
2.7%
0 163304
 
2.7%
Other values (65) 1836694
30.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3034227
50.0%
Space Separator 1135480
 
18.7%
Lowercase Letter 1072102
 
17.7%
Uppercase Letter 534667
 
8.8%
Dash Punctuation 284559
 
4.7%
Other Punctuation 3387
 
0.1%
Close Punctuation 7
 
< 0.1%
Open Punctuation 6
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
u 165038
15.4%
a 133875
12.5%
e 114602
10.7%
r 97161
9.1%
n 90730
8.5%
p 87414
8.2%
c 68763
6.4%
l 60684
 
5.7%
g 53357
 
5.0%
y 47929
 
4.5%
Other values (14) 152549
14.2%
Uppercase Letter
ValueCountFrequency (%)
J 147559
27.6%
A 97950
18.3%
M 97151
18.2%
S 43634
 
8.2%
F 41188
 
7.7%
O 39198
 
7.3%
N 33829
 
6.3%
D 30011
 
5.6%
W 1456
 
0.3%
E 615
 
0.1%
Other values (13) 2076
 
0.4%
Decimal Number
ValueCountFrequency (%)
1 869039
28.6%
9 644744
21.2%
2 290400
 
9.6%
6 256804
 
8.5%
8 242113
 
8.0%
7 176263
 
5.8%
0 163304
 
5.4%
3 136893
 
4.5%
5 134478
 
4.4%
4 120189
 
4.0%
Other Punctuation
ValueCountFrequency (%)
* 2267
66.9%
, 926
27.3%
? 105
 
3.1%
: 53
 
1.6%
/ 21
 
0.6%
. 6
 
0.2%
' 5
 
0.1%
& 2
 
0.1%
; 2
 
0.1%
Math Symbol
ValueCountFrequency (%)
= 1
33.3%
< 1
33.3%
~ 1
33.3%
Close Punctuation
ValueCountFrequency (%)
) 6
85.7%
] 1
 
14.3%
Open Punctuation
ValueCountFrequency (%)
( 5
83.3%
[ 1
 
16.7%
Space Separator
ValueCountFrequency (%)
1135480
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 284559
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 4457669
73.5%
Latin 1606769
 
26.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
u 165038
 
10.3%
J 147559
 
9.2%
a 133875
 
8.3%
e 114602
 
7.1%
A 97950
 
6.1%
r 97161
 
6.0%
M 97151
 
6.0%
n 90730
 
5.6%
p 87414
 
5.4%
c 68763
 
4.3%
Other values (37) 506526
31.5%
Common
ValueCountFrequency (%)
1135480
25.5%
1 869039
19.5%
9 644744
14.5%
2 290400
 
6.5%
- 284559
 
6.4%
6 256804
 
5.8%
8 242113
 
5.4%
7 176263
 
4.0%
0 163304
 
3.7%
3 136893
 
3.1%
Other values (18) 258070
 
5.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6064438
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1135480
18.7%
1 869039
14.3%
9 644744
 
10.6%
2 290400
 
4.8%
- 284559
 
4.7%
6 256804
 
4.2%
8 242113
 
4.0%
7 176263
 
2.9%
u 165038
 
2.7%
0 163304
 
2.7%
Other values (65) 1836694
30.3%
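
The verbatimEventDate samples follow a `day month-abbreviation year` shape, with `--` standing where a day would be (for example `-- May 1966`). Reading `--` as "day unknown" is an interpretation of the sample, not something the report states. A minimal sketch of a parser for that one shape follows; the names `parse_verbatim`, `MONTHS` and `PATTERN` are hypothetical. With 45124 distinct values and lengths up to 82 characters, other shapes clearly exist, and the function returns `None` for those rather than guessing.

```python
import re
from typing import Optional, Tuple

MONTHS = {name: number for number, name in enumerate(
    ["jan", "feb", "mar", "apr", "may", "jun",
     "jul", "aug", "sep", "oct", "nov", "dec"], start=1)}

# Day (or "--" when absent), three-letter month, four-digit year.
PATTERN = re.compile(r"^(\d{1,2}|--) ([A-Za-z]{3}) (\d{4})$")


def parse_verbatim(text: str) -> Optional[Tuple[int, int, Optional[int]]]:
    """Return (year, month, day), or None if the text has another shape."""
    match = PATTERN.match(text.strip())
    if match is None:
        return None
    day_text, month_text, year_text = match.groups()
    month = MONTHS.get(month_text.lower())
    if month is None:
        return None
    day = None if day_text == "--" else int(day_text)
    return int(year_text), month, day


print(parse_verbatim("28 Feb 1989"))  # (1989, 2, 28)
print(parse_verbatim("-- May 1966"))  # (1966, 5, None)
```

Returning `None` for the day keeps partial dates usable for year and month analysis without inventing a day value.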

habitat
Text

Missing 

Distinct: 7512
Distinct (%): 5.7%
Missing: 468915
Missing (%): 78.0%
Memory size: 4.6 MiB

Length

Max length: 1014
Median length: 694
Mean length: 27.3692808
Min length: 1

Characters and Unicode

Total characters: 3627415
Distinct characters: 86
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4415
Unique (%): 3.3%

Sample

1st row: Ecological remarks by collector(s): yes
2nd row: Premontane very humid forest
3rd row: Ecological remarks by collector(s): no
4th row: Ecological remarks by collector(s): yes
5th row: Culvert

Value  Count  Frequency (%)
by  49297  9.4%
ecological  48727  9.3%
remarks  48718  9.3%
collector(s  48716  9.3%
yes  41564  8.0%
forest  32139  6.2%
tropical  15058  2.9%
humid  14768  2.8%
no  7275  1.4%
in  6943  1.3%
Other values (3497)  208498  40.0%

Most occurring characters

ValueCountFrequency (%)
389167
 
10.7%
o 316538
 
8.7%
e 293307
 
8.1%
r 281112
 
7.7%
l 253946
 
7.0%
s 244547
 
6.7%
c 240040
 
6.6%
a 233816
 
6.4%
i 137021
 
3.8%
t 136017
 
3.7%
Other values (76) 1101904
30.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2931962
80.8%
Space Separator 389167
 
10.7%
Uppercase Letter 134371
 
3.7%
Other Punctuation 62424
 
1.7%
Open Punctuation 49723
 
1.4%
Close Punctuation 49712
 
1.4%
Decimal Number 6872
 
0.2%
Dash Punctuation 3142
 
0.1%
Math Symbol 40
 
< 0.1%
Final Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 316538
10.8%
e 293307
10.0%
r 281112
9.6%
l 253946
 
8.7%
s 244547
 
8.3%
c 240040
 
8.2%
a 233816
 
8.0%
i 137021
 
4.7%
t 136017
 
4.6%
y 117063
 
4.0%
Other values (16) 678555
23.1%
Uppercase Letter
ValueCountFrequency (%)
E 49837
37.1%
T 18330
 
13.6%
S 10045
 
7.5%
R 7675
 
5.7%
P 6589
 
4.9%
G 6219
 
4.6%
C 4362
 
3.2%
M 4095
 
3.0%
A 3747
 
2.8%
B 3506
 
2.6%
Other values (16) 19966
14.9%
Other Punctuation
ValueCountFrequency (%)
: 48943
78.4%
, 7291
 
11.7%
. 4022
 
6.4%
; 832
 
1.3%
" 403
 
0.6%
& 381
 
0.6%
/ 229
 
0.4%
? 145
 
0.2%
' 102
 
0.2%
# 62
 
0.1%
Other values (3) 14
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
0 2599
37.8%
1 1142
16.6%
2 872
 
12.7%
3 636
 
9.3%
5 469
 
6.8%
4 334
 
4.9%
8 251
 
3.7%
6 220
 
3.2%
7 185
 
2.7%
9 164
 
2.4%
Close Punctuation
ValueCountFrequency (%)
) 49366
99.3%
] 345
 
0.7%
} 1
 
< 0.1%
Math Symbol
ValueCountFrequency (%)
= 33
82.5%
+ 5
 
12.5%
~ 2
 
5.0%
Open Punctuation
ValueCountFrequency (%)
( 49378
99.3%
[ 345
 
0.7%
Space Separator
ValueCountFrequency (%)
389167
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3142
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3066333
84.5%
Common 561082
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 316538
 
10.3%
e 293307
 
9.6%
r 281112
 
9.2%
l 253946
 
8.3%
s 244547
 
8.0%
c 240040
 
7.8%
a 233816
 
7.6%
i 137021
 
4.5%
t 136017
 
4.4%
y 117063
 
3.8%
Other values (42) 812926
26.5%
Common
ValueCountFrequency (%)
389167
69.4%
( 49378
 
8.8%
) 49366
 
8.8%
: 48943
 
8.7%
, 7291
 
1.3%
. 4022
 
0.7%
- 3142
 
0.6%
0 2599
 
0.5%
1 1142
 
0.2%
2 872
 
0.2%
Other values (24) 5160
 
0.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3627413
> 99.9%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
389167
 
10.7%
o 316538
 
8.7%
e 293307
 
8.1%
r 281112
 
7.7%
l 253946
 
7.0%
s 244547
 
6.7%
c 240040
 
6.6%
a 233816
 
6.4%
i 137021
 
3.8%
t 136017
 
3.7%
Other values (75) 1101902
30.4%
Punctuation
ValueCountFrequency (%)
2
100.0%
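
The habitat word table lists lowercased tokens such as `collector(s`, which suggests that values are split on whitespace and that punctuation is trimmed from the ends of each token. That rule is inferred from the table, not documented in this report. A small sketch under that assumption, with the hypothetical helper `word_counts` applied to the five sample rows:

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased whitespace tokens with edge punctuation stripped."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            token = token.strip(string.punctuation)
            if token:  # skip tokens that were punctuation only
                counts[token] += 1
    return counts


sample = [
    "Ecological remarks by collector(s): yes",
    "Premontane very humid forest",
    "Ecological remarks by collector(s): no",
    "Ecological remarks by collector(s): yes",
    "Culvert",
]
counts = word_counts(sample)
print(counts["collector(s"], counts["yes"], counts["forest"])  # 3 2 1
```

Stripping only the edges is what leaves the inner parenthesis in `collector(s`, matching the token shown in the table.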

Distinct: 8925
Distinct (%): 1.5%
Missing: 440
Missing (%): 0.1%
Memory size: 4.6 MiB

Length

Max length: 146
Median length: 124
Mean length: 39.09340095
Min length: 4

Characters and Unicode

Total characters: 23495564
Distinct characters: 91
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3023
Unique (%): 0.5%

Sample

1st row: North America, Panama, Bocas Del Toro
2nd row: North America, United States, Utah
3rd row: South America, Venezuela, Bolivar
4th row: North America, Mexico, Oaxaca
5th row: North America, North Atlantic Ocean, United States, North Carolina, Carteret

Value  Count  Frequency (%)
america  390243  12.4%
north  378352  12.1%
united  229925  7.3%
states  225212  7.2%
africa  111667  3.6%
south  90792  2.9%
county  80759  2.6%
asia  66157  2.1%
ocean  58408  1.9%
mexico  50692  1.6%
Other values (5566)  1452640  46.3%

Most occurring characters

ValueCountFrequency (%)
2533836
 
10.8%
a 2342309
 
10.0%
i 1683292
 
7.2%
t 1628350
 
6.9%
e 1586909
 
6.8%
r 1444280
 
6.1%
, 1372561
 
5.8%
o 1263879
 
5.4%
n 1236327
 
5.3%
c 879180
 
3.7%
Other values (81) 7524641
32.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16409922
69.8%
Uppercase Letter 3147373
 
13.4%
Space Separator 2533836
 
10.8%
Other Punctuation 1384733
 
5.9%
Dash Punctuation 19470
 
0.1%
Open Punctuation 106
 
< 0.1%
Close Punctuation 106
 
< 0.1%
Decimal Number 12
 
< 0.1%
Modifier Letter 5
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2342309
14.3%
i 1683292
10.3%
t 1628350
9.9%
e 1586909
9.7%
r 1444280
8.8%
o 1263879
7.7%
n 1236327
7.5%
c 879180
 
5.4%
s 644321
 
3.9%
h 637727
 
3.9%
Other values (35) 3063348
18.7%
Uppercase Letter
ValueCountFrequency (%)
A 694612
22.1%
N 456008
14.5%
S 407690
13.0%
U 266626
 
8.5%
C 259605
 
8.2%
M 141875
 
4.5%
P 124642
 
4.0%
O 99864
 
3.2%
B 97350
 
3.1%
T 70558
 
2.2%
Other values (17) 528543
16.8%
Other Punctuation
ValueCountFrequency (%)
, 1372561
99.1%
' 7365
 
0.5%
. 3951
 
0.3%
? 630
 
< 0.1%
* 122
 
< 0.1%
/ 103
 
< 0.1%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 4
33.3%
2 4
33.3%
1 2
16.7%
0 1
 
8.3%
8 1
 
8.3%
Dash Punctuation
ValueCountFrequency (%)
- 19466
> 99.9%
4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
2533836
100.0%
Open Punctuation
ValueCountFrequency (%)
( 106
100.0%
Close Punctuation
ValueCountFrequency (%)
) 106
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 5
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19557295
83.2%
Common 3938269
 
16.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2342309
 
12.0%
i 1683292
 
8.6%
t 1628350
 
8.3%
e 1586909
 
8.1%
r 1444280
 
7.4%
o 1263879
 
6.5%
n 1236327
 
6.3%
c 879180
 
4.5%
A 694612
 
3.6%
s 644321
 
3.3%
Other values (62) 6153836
31.5%
Common
ValueCountFrequency (%)
2533836
64.3%
, 1372561
34.9%
- 19466
 
0.5%
' 7365
 
0.2%
. 3951
 
0.1%
? 630
 
< 0.1%
* 122
 
< 0.1%
( 106
 
< 0.1%
) 106
 
< 0.1%
/ 103
 
< 0.1%
Other values (9) 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 23494048
> 99.9%
None 1504
 
< 0.1%
Modifier Letters 5
 
< 0.1%
Punctuation 4
 
< 0.1%
Latin Ext Additional 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2533836
 
10.8%
a 2342309
 
10.0%
i 1683292
 
7.2%
t 1628350
 
6.9%
e 1586909
 
6.8%
r 1444280
 
6.1%
, 1372561
 
5.8%
o 1263879
 
5.4%
n 1236327
 
5.3%
c 879180
 
3.7%
Other values (59) 7523125
32.0%
None
ValueCountFrequency (%)
é 564
37.5%
ó 346
23.0%
ä 178
 
11.8%
í 176
 
11.7%
ê 104
 
6.9%
è 57
 
3.8%
ô 53
 
3.5%
ū 5
 
0.3%
ā 4
 
0.3%
Đ 3
 
0.2%
Other values (9) 14
 
0.9%
Modifier Letters
ValueCountFrequency (%)
ʻ 5
100.0%
Punctuation
ValueCountFrequency (%)
4
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
100.0%

Distinct: 100
Distinct (%): < 0.1%
Missing: 490
Missing (%): 0.1%
Memory size: 4.6 MiB

Length

Max length: 40
Median length: 13
Mean length: 12.45328399
Min length: 4

Characters and Unicode

Total characters: 7483938
Distinct characters: 32
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 14
Unique (%): < 0.1%

Sample

1st row: North America
2nd row: North America
3rd row: South America
4th row: North America
5th row: North America, North Atlantic Ocean

Value  Count  Frequency (%)
america  390237  33.6%
north  367501  31.6%
africa  99818  8.6%
south  74300  6.4%
asia  66157  5.7%
ocean  58129  5.0%
atlantic  30063  2.6%
pacific  21536  1.9%
europe  14885  1.3%
unknown  13134  1.1%
Other values (9)  26436  2.3%

Most occurring characters

ValueCountFrequency (%)
r 882026
11.8%
a 694459
9.3%
i 652876
8.7%
c 637049
8.5%
A 593130
 
7.9%
561235
 
7.5%
t 524859
 
7.0%
o 485683
 
6.5%
e 466196
 
6.2%
h 444531
 
5.9%
Other values (22) 1541894
20.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5712548
76.3%
Uppercase Letter 1162018
 
15.5%
Space Separator 561235
 
7.5%
Other Punctuation 48137
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 882026
15.4%
a 694459
12.2%
i 652876
11.4%
c 637049
11.2%
t 524859
9.2%
o 485683
8.5%
e 466196
8.2%
h 444531
7.8%
m 390237
6.8%
n 137405
 
2.4%
Other values (9) 397227
7.0%
Uppercase Letter
ValueCountFrequency (%)
A 593130
51.0%
N 367501
31.6%
S 77030
 
6.6%
O 58344
 
5.0%
P 21536
 
1.9%
E 14885
 
1.3%
L 13133
 
1.1%
U 13133
 
1.1%
I 3326
 
0.3%
Other Punctuation
ValueCountFrequency (%)
, 47959
99.6%
? 142
 
0.3%
/ 36
 
0.1%
Space Separator
ValueCountFrequency (%)
561235
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6874566
91.9%
Common 609372
 
8.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 882026
12.8%
a 694459
10.1%
i 652876
9.5%
c 637049
9.3%
A 593130
8.6%
t 524859
7.6%
o 485683
7.1%
e 466196
6.8%
h 444531
6.5%
m 390237
 
5.7%
Other values (18) 1103520
16.1%
Common
ValueCountFrequency (%)
561235
92.1%
, 47959
 
7.9%
? 142
 
< 0.1%
/ 36
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7483938
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 882026
11.8%
a 694459
9.3%
i 652876
8.7%
c 637049
8.5%
A 593130
 
7.9%
561235
 
7.5%
t 524859
 
7.0%
o 485683
 
6.5%
e 466196
 
6.2%
h 444531
 
5.9%
Other values (22) 1541894
20.6%

waterBody
Text

Missing 

Distinct: 1298
Distinct (%): 2.1%
Missing: 539858
Missing (%): 89.8%
Memory size: 4.6 MiB

Length

Max length: 79
Median length: 75
Mean length: 24.02534379
Min length: 6

Characters and Unicode

Total characters: 1479793
Distinct characters: 61
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 776
Unique (%): 1.3%

Sample

1st row: North Atlantic Ocean
2nd row: North Pacific Ocean, Bering Sea
3rd row: North Pacific Ocean
4th row: North Atlantic Ocean, Gulf Of Mexico
5th row: North Pacific Ocean

Value  Count  Frequency (%)
ocean  58130  25.3%
north  49957  21.8%
atlantic  30063  13.1%
pacific  21536  9.4%
sea  8710  3.8%
of  8285  3.6%
gulf  7277  3.2%
mexico  6087  2.7%
south  3736  1.6%
indian  3443  1.5%
Other values (1047)  32100  14.0%

Most occurring characters

ValueCountFrequency (%)
167731
11.3%
a 149650
 
10.1%
c 142458
 
9.6%
t 125319
 
8.5%
n 116971
 
7.9%
i 97425
 
6.6%
e 90274
 
6.1%
o 70318
 
4.8%
O 66128
 
4.5%
r 64946
 
4.4%
Other values (51) 388573
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1060630
71.7%
Uppercase Letter 228943
 
15.5%
Space Separator 167731
 
11.3%
Other Punctuation 22340
 
1.5%
Dash Punctuation 147
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 149650
14.1%
c 142458
13.4%
t 125319
11.8%
n 116971
11.0%
i 97425
9.2%
e 90274
8.5%
o 70318
6.6%
r 64946
6.1%
h 61407
5.8%
l 46029
 
4.3%
Other values (17) 95833
9.0%
Uppercase Letter
ValueCountFrequency (%)
O 66128
28.9%
N 50247
21.9%
A 32498
14.2%
P 22062
 
9.6%
S 16927
 
7.4%
G 7662
 
3.3%
C 7479
 
3.3%
M 7332
 
3.2%
B 7248
 
3.2%
I 3893
 
1.7%
Other values (15) 7467
 
3.3%
Other Punctuation
ValueCountFrequency (%)
, 22196
99.4%
? 67
 
0.3%
. 43
 
0.2%
' 33
 
0.1%
* 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
167731
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 147
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1289573
87.1%
Common 190220
 
12.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 149650
11.6%
c 142458
11.0%
t 125319
9.7%
n 116971
 
9.1%
i 97425
 
7.6%
e 90274
 
7.0%
o 70318
 
5.5%
O 66128
 
5.1%
r 64946
 
5.0%
h 61407
 
4.8%
Other values (42) 304677
23.6%
Common
ValueCountFrequency (%)
167731
88.2%
, 22196
 
11.7%
- 147
 
0.1%
? 67
 
< 0.1%
. 43
 
< 0.1%
' 33
 
< 0.1%
* 1
 
< 0.1%
( 1
 
< 0.1%
) 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1479792
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
167731
11.3%
a 149650
 
10.1%
c 142458
 
9.6%
t 125319
 
8.5%
n 116971
 
7.9%
i 97425
 
6.6%
e 90274
 
6.1%
o 70318
 
4.8%
O 66128
 
4.5%
r 64946
 
4.4%
Other values (50) 388572
26.3%
None
ValueCountFrequency (%)
ö 1
100.0%

islandGroup
Text

Missing 

Distinct: 68
Distinct (%): 1.4%
Missing: 596682
Missing (%): 99.2%
Memory size: 4.6 MiB

Length

Max length: 29
Median length: 24
Mean length: 13.28538478
Min length: 8

Characters and Unicode

Total characters: 63358
Distinct characters: 46
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 18
Unique (%): 0.4%

Sample

1st row: Pribilof Islands
2nd row: Pribilof Islands
3rd row: Ryukyu Islands
4th row: Pribilof Islands
5th row: Batan Islands

Value  Count  Frequency (%)
islands  3374  40.8%
pribilof  1808  21.9%
moluccas  1194  14.4%
ryukyu  497  6.0%
babuyan  176  2.1%
channel  159  1.9%
batan  120  1.5%
nicobar  108  1.3%
bismarck  94  1.1%
yap  83  1.0%
Other values (66)  653  7.9%

Most occurring characters

ValueCountFrequency (%)
s 8103
12.8%
l 6718
 
10.6%
a 6381
 
10.1%
n 4444
 
7.0%
i 4222
 
6.7%
d 3521
 
5.6%
3497
 
5.5%
I 3376
 
5.3%
o 3353
 
5.3%
c 2688
 
4.2%
Other values (36) 17055
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 51599
81.4%
Uppercase Letter 8262
 
13.0%
Space Separator 3497
 
5.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 8103
15.7%
l 6718
13.0%
a 6381
12.4%
n 4444
8.6%
i 4222
8.2%
d 3521
6.8%
o 3353
6.5%
c 2688
 
5.2%
u 2566
 
5.0%
r 2242
 
4.3%
Other values (14) 7361
14.3%
Uppercase Letter
ValueCountFrequency (%)
I 3376
40.9%
P 1814
22.0%
M 1235
 
14.9%
R 497
 
6.0%
B 412
 
5.0%
C 183
 
2.2%
S 153
 
1.9%
A 151
 
1.8%
N 122
 
1.5%
Y 83
 
1.0%
Other values (11) 236
 
2.9%
Space Separator
ValueCountFrequency (%)
3497
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 59861
94.5%
Common 3497
 
5.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 8103
13.5%
l 6718
11.2%
a 6381
10.7%
n 4444
 
7.4%
i 4222
 
7.1%
d 3521
 
5.9%
I 3376
 
5.6%
o 3353
 
5.6%
c 2688
 
4.5%
u 2566
 
4.3%
Other values (35) 14489
24.2%
Common
ValueCountFrequency (%)
3497
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 63358
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 8103
12.8%
l 6718
 
10.6%
a 6381
 
10.1%
n 4444
 
7.0%
i 4222
 
6.7%
d 3521
 
5.6%
3497
 
5.5%
I 3376
 
5.3%
o 3353
 
5.3%
c 2688
 
4.2%
Other values (36) 17055
26.9%

island
Text

Missing 

Distinct: 345
Distinct (%): 0.9%
Missing: 564842
Missing (%): 93.9%
Memory size: 4.6 MiB

Length

Max length: 23
Median length: 21
Mean length: 8.146903767
Min length: 1

Characters and Unicode

Total characters: 298250
Distinct characters: 57
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 103
Unique (%): 0.3%

Sample

1st row: St. Paul Island
2nd row: St. Paul Island
3rd row: Trinidad
4th row: Borneo
5th row: Culion Island

Value  Count  Frequency (%)
island  7184  14.8%
borneo  5932  12.2%
sumatra  3675  7.5%
luzon  3124  6.4%
java  3005  6.2%
celebes  2678  5.5%
trinidad  2605  5.4%
st  1818  3.7%
paul  1799  3.7%
honshu  1290  2.6%
Other values (366)  15576  32.0%

Most occurring characters

ValueCountFrequency (%)
a 39564
13.3%
n 28846
 
9.7%
o 23778
 
8.0%
e 21049
 
7.1%
r 16512
 
5.5%
d 15796
 
5.3%
l 15656
 
5.2%
s 14538
 
4.9%
u 14063
 
4.7%
12077
 
4.0%
Other values (47) 96371
32.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 235808
79.1%
Uppercase Letter 48529
 
16.3%
Space Separator 12077
 
4.0%
Other Punctuation 1830
 
0.6%
Dash Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 39564
16.8%
n 28846
12.2%
o 23778
10.1%
e 21049
8.9%
r 16512
7.0%
d 15796
 
6.7%
l 15656
 
6.6%
s 14538
 
6.2%
u 14063
 
6.0%
i 11254
 
4.8%
Other values (16) 34752
14.7%
Uppercase Letter
ValueCountFrequency (%)
I 7839
16.2%
B 7137
14.7%
S 7123
14.7%
L 4203
8.7%
C 3825
7.9%
P 3689
7.6%
T 3258
6.7%
J 3022
 
6.2%
N 2160
 
4.5%
H 1664
 
3.4%
Other values (14) 4609
9.5%
Other Punctuation
ValueCountFrequency (%)
. 1817
99.3%
' 9
 
0.5%
? 2
 
0.1%
* 1
 
0.1%
, 1
 
0.1%
Space Separator
ValueCountFrequency (%)
12077
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 284337
95.3%
Common 13913
 
4.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 39564
13.9%
n 28846
 
10.1%
o 23778
 
8.4%
e 21049
 
7.4%
r 16512
 
5.8%
d 15796
 
5.6%
l 15656
 
5.5%
s 14538
 
5.1%
u 14063
 
4.9%
i 11254
 
4.0%
Other values (40) 83281
29.3%
Common
ValueCountFrequency (%)
12077
86.8%
. 1817
 
13.1%
' 9
 
0.1%
- 6
 
< 0.1%
? 2
 
< 0.1%
* 1
 
< 0.1%
, 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 298250
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 39564
13.3%
n 28846
 
9.7%
o 23778
 
8.0%
e 21049
 
7.1%
r 16512
 
5.5%
d 15796
 
5.3%
l 15656
 
5.2%
s 14538
 
4.9%
u 14063
 
4.7%
12077
 
4.0%
Other values (47) 96371
32.3%

country
Text

Missing 

Distinct: 322
Distinct (%): 0.1%
Missing: 6532
Missing (%): 1.1%
Memory size: 4.6 MiB

Length

Max length: 44
Median length: 33
Mean length: 10.00060512
Min length: 1

Characters and Unicode

Total characters: 5949550
Distinct characters: 62
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 44
Unique (%): < 0.1%

Sample

1st row: Panama
2nd row: United States
3rd row: Venezuela
4th row: Mexico
5th row: United States

Value  Count  Frequency (%)
united  229925  25.8%
states  225212  25.3%
mexico  34730  3.9%
panama  25482  2.9%
venezuela  24981  2.8%
canada  19301  2.2%
colombia  16624  1.9%
indonesia  14922  1.7%
south  12721  1.4%
brazil  12246  1.4%
Other values (303)  274156  30.8%

Most occurring characters

ValueCountFrequency (%)
t 752064
12.6%
a 718314
12.1%
e 679639
11.4%
n 502391
 
8.4%
i 486974
 
8.2%
d 307154
 
5.2%
295381
 
5.0%
s 291189
 
4.9%
S 249895
 
4.2%
U 243908
 
4.1%
Other values (52) 1422641
23.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4760116
80.0%
Uppercase Letter 886812
 
14.9%
Space Separator 295381
 
5.0%
Other Punctuation 7042
 
0.1%
Open Punctuation 91
 
< 0.1%
Close Punctuation 91
 
< 0.1%
Dash Punctuation 17
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 752064
15.8%
a 718314
15.1%
e 679639
14.3%
n 502391
10.6%
i 486974
10.2%
d 307154
6.5%
s 291189
 
6.1%
o 210105
 
4.4%
l 114970
 
2.4%
r 100369
 
2.1%
Other values (17) 596947
12.5%
Uppercase Letter
ValueCountFrequency (%)
S 249895
28.2%
U 243908
27.5%
M 59605
 
6.7%
C 52665
 
5.9%
P 45129
 
5.1%
B 31355
 
3.5%
I 29005
 
3.3%
V 28088
 
3.2%
A 20638
 
2.3%
G 19590
 
2.2%
Other values (15) 106934
12.1%
Other Punctuation
ValueCountFrequency (%)
' 3154
44.8%
. 1894
26.9%
, 1725
24.5%
? 243
 
3.5%
/ 25
 
0.4%
* 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
295381
100.0%
Open Punctuation
ValueCountFrequency (%)
( 91
100.0%
Close Punctuation
ValueCountFrequency (%)
) 91
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 17
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5646928
94.9%
Common 302622
 
5.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 752064
13.3%
a 718314
12.7%
e 679639
12.0%
n 502391
8.9%
i 486974
8.6%
d 307154
 
5.4%
s 291189
 
5.2%
S 249895
 
4.4%
U 243908
 
4.3%
o 210105
 
3.7%
Other values (42) 1205295
21.3%
Common
ValueCountFrequency (%)
295381
97.6%
' 3154
 
1.0%
. 1894
 
0.6%
, 1725
 
0.6%
? 243
 
0.1%
( 91
 
< 0.1%
) 91
 
< 0.1%
/ 25
 
< 0.1%
- 17
 
< 0.1%
* 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5949549
> 99.9%
None 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 752064
12.6%
a 718314
12.1%
e 679639
11.4%
n 502391
 
8.4%
i 486974
 
8.2%
d 307154
 
5.2%
295381
 
5.0%
s 291189
 
4.9%
S 249895
 
4.2%
U 243908
 
4.1%
Other values (51) 1422640
23.9%
None
ValueCountFrequency (%)
ç 1
100.0%

stateProvince
Text

Missing 

Distinct1750
Distinct (%)0.3%
Missing93954
Missing (%)15.6%
Memory size4.6 MiB

Length

Max length31
Median length27
Mean length9.156487625
Min length1

Characters and Unicode

Total characters4646890
Distinct characters75
Distinct categories9
Distinct scripts2
Distinct blocks2

Unique

Unique314
Unique (%)0.1%

Sample

1st rowBocas Del Toro
2nd rowUtah
3rd rowBolivar
4th rowOaxaca
5th rowNorth Carolina
ValueCountFrequency (%)
california 37958
 
5.7%
new 18698
 
2.8%
alaska 18000
 
2.7%
oregon 15112
 
2.3%
province 15077
 
2.2%
arizona 13072
 
1.9%
virginia 12189
 
1.8%
washington 12057
 
1.8%
texas 11524
 
1.7%
mexico 9875
 
1.5%
Other values (1720) 507096
75.6%

Most occurring characters

ValueCountFrequency (%)
a 685721
14.8%
i 388351
 
8.4%
n 356516
 
7.7%
o 350614
 
7.5%
r 326855
 
7.0%
e 277944
 
6.0%
l 192295
 
4.1%
s 173201
 
3.7%
t 172374
 
3.7%
(space) 163161
 
3.5%
Other values (65) 1559858
33.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 3782086
81.4%
Uppercase Letter 683335
 
14.7%
Space Separator 163161
 
3.5%
Dash Punctuation 15111
 
0.3%
Other Punctuation 3190
 
0.1%
Decimal Number 4
 
< 0.1%
Math Symbol 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 685721
18.1%
i 388351
10.3%
n 356516
9.4%
o 350614
9.3%
r 326855
8.6%
e 277944
 
7.3%
l 192295
 
5.1%
s 173201
 
4.6%
t 172374
 
4.6%
u 116650
 
3.1%
Other values (25) 741565
19.6%
Uppercase Letter
ValueCountFrequency (%)
C 96322
14.1%
A 66126
 
9.7%
N 63963
 
9.4%
M 54370
 
8.0%
S 44892
 
6.6%
T 39318
 
5.8%
P 37886
 
5.5%
B 35544
 
5.2%
W 30828
 
4.5%
O 27556
 
4.0%
Other values (16) 186530
27.3%
Other Punctuation
ValueCountFrequency (%)
' 2998
94.0%
? 159
 
5.0%
/ 21
 
0.7%
* 6
 
0.2%
. 5
 
0.2%
: 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
0 1
25.0%
8 1
25.0%
Space Separator
ValueCountFrequency (%)
163161
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15111
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4465421
96.1%
Common 181469
 
3.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 685721
15.4%
i 388351
 
8.7%
n 356516
 
8.0%
o 350614
 
7.9%
r 326855
 
7.3%
e 277944
 
6.2%
l 192295
 
4.3%
s 173201
 
3.9%
t 172374
 
3.9%
u 116650
 
2.6%
Other values (51) 1424900
31.9%
Common
ValueCountFrequency (%)
163161
89.9%
- 15111
 
8.3%
' 2998
 
1.7%
? 159
 
0.1%
/ 21
 
< 0.1%
* 6
 
< 0.1%
. 5
 
< 0.1%
1 2
 
< 0.1%
0 1
 
< 0.1%
: 1
 
< 0.1%
Other values (4) 4
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4645873
> 99.9%
None 1017
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 685721
14.8%
i 388351
 
8.4%
n 356516
 
7.7%
o 350614
 
7.5%
r 326855
 
7.0%
e 277944
 
6.0%
l 192295
 
4.1%
s 173201
 
3.7%
t 172374
 
3.7%
163161
 
3.5%
Other values (56) 1558841
33.6%
None
ValueCountFrequency (%)
é 367
36.1%
ó 346
34.0%
ä 178
17.5%
ê 92
 
9.0%
ô 30
 
2.9%
ç 1
 
0.1%
ã 1
 
0.1%
ō 1
 
0.1%
æ 1
 
0.1%

county
Text

Missing 

Distinct3194
Distinct (%)2.1%
Missing447402
Missing (%)74.4%
Memory size4.6 MiB

Length

Max length47
Median length27
Mean length13.46725393
Min length1

Characters and Unicode

Total characters2074617
Distinct characters79
Distinct categories9
Distinct scripts2
Distinct blocks5

Unique

Unique663
Unique (%)0.4%

Sample

1st rowCarteret
2nd rowCusco
3rd rowMonterey County
4th rowGalveston
5th rowTamana Ward
ValueCountFrequency (%)
county 80697
27.5%
district 13828
 
4.7%
islands 3705
 
1.3%
division 3460
 
1.2%
san 3315
 
1.1%
province 2619
 
0.9%
schoolcraft 2179
 
0.7%
mackenzie 1966
 
0.7%
lane 1935
 
0.7%
municipality 1862
 
0.6%
Other values (2969) 178313
60.7%

Most occurring characters

ValueCountFrequency (%)
n 189818
 
9.1%
o 175404
 
8.5%
t 161467
 
7.8%
a 160330
 
7.7%
(space) 139830
 
6.7%
i 120188
 
5.8%
u 116014
 
5.6%
e 111686
 
5.4%
r 102364
 
4.9%
C 99007
 
4.8%
Other values (69) 698509
33.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1630734
78.6%
Uppercase Letter 298270
 
14.4%
Space Separator 139830
 
6.7%
Dash Punctuation 4189
 
0.2%
Other Punctuation 1555
 
0.1%
Close Punctuation 13
 
< 0.1%
Open Punctuation 13
 
< 0.1%
Decimal Number 8
 
< 0.1%
Modifier Letter 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 189818
11.6%
o 175404
10.8%
t 161467
9.9%
a 160330
9.8%
i 120188
 
7.4%
u 116014
 
7.1%
e 111686
 
6.8%
r 102364
 
6.3%
y 97639
 
6.0%
s 76836
 
4.7%
Other values (28) 318988
19.6%
Uppercase Letter
ValueCountFrequency (%)
C 99007
33.2%
D 27665
 
9.3%
S 18077
 
6.1%
M 17795
 
6.0%
B 15214
 
5.1%
P 13875
 
4.7%
A 12422
 
4.2%
L 11112
 
3.7%
G 10792
 
3.6%
W 8980
 
3.0%
Other values (17) 63331
21.2%
Other Punctuation
ValueCountFrequency (%)
' 1171
75.3%
. 192
 
12.3%
* 113
 
7.3%
? 56
 
3.6%
/ 21
 
1.4%
, 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 4185
99.9%
4
 
0.1%
Decimal Number
ValueCountFrequency (%)
2 4
50.0%
4 4
50.0%
Space Separator
ValueCountFrequency (%)
139830
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1929004
93.0%
Common 145613
 
7.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 189818
 
9.8%
o 175404
 
9.1%
t 161467
 
8.4%
a 160330
 
8.3%
i 120188
 
6.2%
u 116014
 
6.0%
e 111686
 
5.8%
r 102364
 
5.3%
C 99007
 
5.1%
y 97639
 
5.1%
Other values (55) 595087
30.8%
Common
ValueCountFrequency (%)
139830
96.0%
- 4185
 
2.9%
' 1171
 
0.8%
. 192
 
0.1%
* 113
 
0.1%
? 56
 
< 0.1%
/ 21
 
< 0.1%
) 13
 
< 0.1%
( 13
 
< 0.1%
ʻ 5
 
< 0.1%
Other values (4) 14
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2074120
> 99.9%
None 485
 
< 0.1%
Modifier Letters 5
 
< 0.1%
Punctuation 4
 
< 0.1%
Latin Ext Additional 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 189818
 
9.2%
o 175404
 
8.5%
t 161467
 
7.8%
a 160330
 
7.7%
139830
 
6.7%
i 120188
 
5.8%
u 116014
 
5.6%
e 111686
 
5.4%
r 102364
 
4.9%
C 99007
 
4.8%
Other values (54) 698012
33.7%
None
ValueCountFrequency (%)
é 197
40.6%
í 176
36.3%
è 57
 
11.8%
ô 23
 
4.7%
ê 12
 
2.5%
ū 5
 
1.0%
ā 4
 
0.8%
Đ 3
 
0.6%
ơ 3
 
0.6%
à 3
 
0.6%
Other values (2) 2
 
0.4%
Modifier Letters
ValueCountFrequency (%)
ʻ 5
100.0%
Punctuation
ValueCountFrequency (%)
4
100.0%
Latin Ext Additional
ValueCountFrequency (%)
3
100.0%

locality
Text

Missing 

Distinct86656
Distinct (%)15.3%
Missing35404
Missing (%)5.9%
Memory size4.6 MiB

Length

Max length294
Median length159
Mean length21.69044267
Min length1

Characters and Unicode

Total characters12277810
Distinct characters126
Distinct categories13
Distinct scripts3
Distinct blocks4

Unique

Unique52764
Unique (%)9.3%

Sample

1st rowTierra Oscura, 3.5 Km S. Tiger Key
2nd rowUinta Forest, Currant Creek
3rd rowkm. 125, 85 Km SSE El Dorado
4th rowTotontepec
5th rowAtlantic Beach, Atlantic Beach, 1/2 Mi E Of Triple S Pier.
ValueCountFrequency (%)
km 82857
 
3.9%
mi 82389
 
3.8%
of 34259
 
1.6%
n 30440
 
1.4%
river 28140
 
1.3%
s 27057
 
1.3%
e 26413
 
1.2%
w 26172
 
1.2%
island 23296
 
1.1%
san 23251
 
1.1%
Other values (42744) 1760837
82.1%

Most occurring characters

ValueCountFrequency (%)
(space) 1579064
 
12.9%
a 1198873
 
9.8%
e 766610
 
6.2%
i 659790
 
5.4%
n 655818
 
5.3%
o 653029
 
5.3%
r 550115
 
4.5%
l 446951
 
3.6%
t 434393
 
3.5%
, 393002
 
3.2%
Other values (116) 4940165
40.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7761808
63.2%
Uppercase Letter 2026861
 
16.5%
Space Separator 1579064
 
12.9%
Other Punctuation 489421
 
4.0%
Decimal Number 361074
 
2.9%
Open Punctuation 19801
 
0.2%
Close Punctuation 19779
 
0.2%
Dash Punctuation 15950
 
0.1%
Math Symbol 3991
 
< 0.1%
Connector Punctuation 54
 
< 0.1%
Other values (3) 7
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1198873
15.4%
e 766610
9.9%
i 659790
 
8.5%
n 655818
 
8.4%
o 653029
 
8.4%
r 550115
 
7.1%
l 446951
 
5.8%
t 434393
 
5.6%
s 353920
 
4.6%
u 324066
 
4.2%
Other values (49) 1718243
22.1%
Uppercase Letter
ValueCountFrequency (%)
S 227459
 
11.2%
M 200123
 
9.9%
C 146981
 
7.3%
N 141675
 
7.0%
K 124320
 
6.1%
R 117188
 
5.8%
B 112739
 
5.6%
P 108902
 
5.4%
E 107643
 
5.3%
W 98686
 
4.9%
Other values (21) 641145
31.6%
Other Punctuation
ValueCountFrequency (%)
, 393002
80.3%
. 71840
 
14.7%
; 9568
 
2.0%
' 6996
 
1.4%
/ 2669
 
0.5%
: 2390
 
0.5%
" 1272
 
0.3%
? 612
 
0.1%
& 491
 
0.1%
# 388
 
0.1%
Other values (3) 193
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 75042
20.8%
2 56929
15.8%
5 50938
14.1%
0 37299
10.3%
3 35827
9.9%
4 29576
 
8.2%
6 25291
 
7.0%
8 19038
 
5.3%
7 17890
 
5.0%
9 13244
 
3.7%
Math Symbol
ValueCountFrequency (%)
= 3747
93.9%
+ 184
 
4.6%
~ 60
 
1.5%
Open Punctuation
ValueCountFrequency (%)
( 10013
50.6%
[ 9788
49.4%
Close Punctuation
ValueCountFrequency (%)
) 9993
50.5%
] 9786
49.5%
Space Separator
ValueCountFrequency (%)
1579064
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 15950
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 54
100.0%
Other Symbol
ValueCountFrequency (%)
° 3
100.0%
Final Punctuation
ValueCountFrequency (%)
2
100.0%
Other Number
ValueCountFrequency (%)
¼ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9788654
79.7%
Common 2489141
 
20.3%
Cyrillic 15
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1198873
 
12.2%
e 766610
 
7.8%
i 659790
 
6.7%
n 655818
 
6.7%
o 653029
 
6.7%
r 550115
 
5.6%
l 446951
 
4.6%
t 434393
 
4.4%
s 353920
 
3.6%
u 324066
 
3.3%
Other values (68) 3745089
38.3%
Common
ValueCountFrequency (%)
1579064
63.4%
, 393002
 
15.8%
1 75042
 
3.0%
. 71840
 
2.9%
2 56929
 
2.3%
5 50938
 
2.0%
0 37299
 
1.5%
3 35827
 
1.4%
4 29576
 
1.2%
6 25291
 
1.0%
Other values (26) 134333
 
5.4%
Cyrillic
ValueCountFrequency (%)
л 3
20.0%
к 2
13.3%
т 1
 
6.7%
і 1
 
6.7%
ө 1
 
6.7%
ы 1
 
6.7%
а 1
 
6.7%
м 1
 
6.7%
н 1
 
6.7%
е 1
 
6.7%
Other values (2) 2
13.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12277174
> 99.9%
None 619
 
< 0.1%
Cyrillic 15
 
< 0.1%
Punctuation 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1579064
 
12.9%
a 1198873
 
9.8%
e 766610
 
6.2%
i 659790
 
5.4%
n 655818
 
5.3%
o 653029
 
5.3%
r 550115
 
4.5%
l 446951
 
3.6%
t 434393
 
3.5%
, 393002
 
3.2%
Other values (75) 4939529
40.2%
None
ValueCountFrequency (%)
é 382
61.7%
è 107
 
17.3%
ø 19
 
3.1%
ñ 19
 
3.1%
á 11
 
1.8%
ö 11
 
1.8%
ã 7
 
1.1%
ü 7
 
1.1%
ó 7
 
1.1%
Œ 6
 
1.0%
Other values (18) 43
 
6.9%
Cyrillic
ValueCountFrequency (%)
л 3
20.0%
к 2
13.3%
т 1
 
6.7%
і 1
 
6.7%
ө 1
 
6.7%
ы 1
 
6.7%
а 1
 
6.7%
м 1
 
6.7%
н 1
 
6.7%
е 1
 
6.7%
Other values (2) 2
13.3%
Punctuation
ValueCountFrequency (%)
2
100.0%

minimumElevationInMeters
Text

Missing

Distinct1508
Distinct (%)1.4%
Missing496901
Missing (%)82.6%
Memory size4.6 MiB

Length

Max length6
Median length5
Mean length5.297771401
Min length3

Characters and Unicode

Total characters553882
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1

Unique

Unique432
Unique (%)0.4%

Sample

1st row1032.0
2nd row1006.0
3rd row545.0
4th row2134.0
5th row130.0
ValueCountFrequency (%)
155.0 2555
 
2.4%
150.0 2079
 
2.0%
975.0 1931
 
1.8%
1829.0 1925
 
1.8%
1524.0 1732
 
1.7%
1219.0 1705
 
1.6%
2438.0 1490
 
1.4%
2134.0 1369
 
1.3%
914.0 1339
 
1.3%
610.0 1184
 
1.1%
Other values (1495) 87241
83.4%

Most occurring characters

ValueCountFrequency (%)
0 160391
29.0%
. 104550
18.9%
1 64341
11.6%
2 42599
 
7.7%
5 40675
 
7.3%
3 28323
 
5.1%
4 25625
 
4.6%
7 24670
 
4.5%
9 21989
 
4.0%
6 20905
 
3.8%
Other values (2) 19814
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 449325
81.1%
Other Punctuation 104550
 
18.9%
Dash Punctuation 7
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 160391
35.7%
1 64341
14.3%
2 42599
 
9.5%
5 40675
 
9.1%
3 28323
 
6.3%
4 25625
 
5.7%
7 24670
 
5.5%
9 21989
 
4.9%
6 20905
 
4.7%
8 19807
 
4.4%
Other Punctuation
ValueCountFrequency (%)
. 104550
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 553882
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 160391
29.0%
. 104550
18.9%
1 64341
11.6%
2 42599
 
7.7%
5 40675
 
7.3%
3 28323
 
5.1%
4 25625
 
4.6%
7 24670
 
4.5%
9 21989
 
4.0%
6 20905
 
3.8%
Other values (2) 19814
 
3.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 553882
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 160391
29.0%
. 104550
18.9%
1 64341
11.6%
2 42599
 
7.7%
5 40675
 
7.3%
3 28323
 
5.1%
4 25625
 
4.6%
7 24670
 
4.5%
9 21989
 
4.0%
6 20905
 
3.8%
Other values (2) 19814
 
3.6%

maximumElevationInMeters
Text

Missing

Distinct115
Distinct (%)3.0%
Missing597572
Missing (%)99.4%
Memory size4.6 MiB

Length

Max length6
Median length5
Mean length5.129156999
Min length3

Characters and Unicode

Total characters19896
Distinct characters11
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique27
Unique (%)0.7%

Sample

1st row1951.0
2nd row2835.0
3rd row61.0
4th row2200.0
5th row1500.0
ValueCountFrequency (%)
76.0 652
16.8%
1500.0 427
 
11.0%
152.0 278
 
7.2%
914.0 240
 
6.2%
2200.0 237
 
6.1%
30.0 175
 
4.5%
2010.0 156
 
4.0%
488.0 143
 
3.7%
400.0 138
 
3.6%
305.0 120
 
3.1%
Other values (105) 1313
33.8%

Most occurring characters

ValueCountFrequency (%)
0 6753
33.9%
. 3879
19.5%
1 1956
 
9.8%
2 1621
 
8.1%
5 1289
 
6.5%
6 978
 
4.9%
7 921
 
4.6%
4 876
 
4.4%
3 675
 
3.4%
8 516
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16017
80.5%
Other Punctuation 3879
 
19.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 6753
42.2%
1 1956
 
12.2%
2 1621
 
10.1%
5 1289
 
8.0%
6 978
 
6.1%
7 921
 
5.8%
4 876
 
5.5%
3 675
 
4.2%
8 516
 
3.2%
9 432
 
2.7%
Other Punctuation
ValueCountFrequency (%)
. 3879
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 19896
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 6753
33.9%
. 3879
19.5%
1 1956
 
9.8%
2 1621
 
8.1%
5 1289
 
6.5%
6 978
 
4.9%
7 921
 
4.6%
4 876
 
4.4%
3 675
 
3.4%
8 516
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19896
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 6753
33.9%
. 3879
19.5%
1 1956
 
9.8%
2 1621
 
8.1%
5 1289
 
6.5%
6 978
 
4.9%
7 921
 
4.6%
4 876
 
4.4%
3 675
 
3.4%
8 516
 
2.6%

verbatimElevation
Text

Missing 

Distinct29
Distinct (%)1.8%
Missing599861
Missing (%)99.7%
Memory size4.6 MiB

Length

Max length33
Median length8
Mean length8.518867925
Min length2

Characters and Unicode

Total characters13545
Distinct characters43
Distinct categories9
Distinct scripts2
Distinct blocks1

Unique

Unique9
Unique (%)0.6%

Sample

1st rowsea level
2nd rowsealevel
3rd rowsealevel
4th rowsealevel
5th rowsee Osgood 1909:214
ValueCountFrequency (%)
sealevel 1096
46.9%
sea 280
 
12.0%
level 277
 
11.9%
ft 143
 
6.1%
104
 
4.5%
100 81
 
3.5%
m 59
 
2.5%
near 32
 
1.4%
below 30
 
1.3%
3 28
 
1.2%
Other values (33) 206
 
8.8%

Most occurring characters

ValueCountFrequency (%)
e 4198
31.0%
l 2792
20.6%
a 1481
 
10.9%
s 1380
 
10.2%
v 1376
 
10.2%
(space) 746
 
5.5%
0 314
 
2.3%
t 156
 
1.2%
1 152
 
1.1%
f 143
 
1.1%
Other values (33) 807
 
6.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12018
88.7%
Space Separator 746
 
5.5%
Decimal Number 555
 
4.1%
Math Symbol 110
 
0.8%
Uppercase Letter 87
 
0.6%
Dash Punctuation 22
 
0.2%
Other Punctuation 5
 
< 0.1%
Open Punctuation 1
 
< 0.1%
Close Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4198
34.9%
l 2792
23.2%
a 1481
 
12.3%
s 1380
 
11.5%
v 1376
 
11.4%
t 156
 
1.3%
f 143
 
1.2%
c 92
 
0.8%
m 62
 
0.5%
r 61
 
0.5%
Other values (12) 277
 
2.3%
Decimal Number
ValueCountFrequency (%)
0 314
56.6%
1 152
27.4%
3 52
 
9.4%
5 16
 
2.9%
2 10
 
1.8%
7 6
 
1.1%
9 3
 
0.5%
4 2
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
P 28
32.2%
G 28
32.2%
S 28
32.2%
M 1
 
1.1%
K 1
 
1.1%
O 1
 
1.1%
Other Punctuation
ValueCountFrequency (%)
. 3
60.0%
: 2
40.0%
Space Separator
ValueCountFrequency (%)
746
100.0%
Math Symbol
ValueCountFrequency (%)
< 110
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 22
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12105
89.4%
Common 1440
 
10.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4198
34.7%
l 2792
23.1%
a 1481
 
12.2%
s 1380
 
11.4%
v 1376
 
11.4%
t 156
 
1.3%
f 143
 
1.2%
c 92
 
0.8%
m 62
 
0.5%
r 61
 
0.5%
Other values (18) 364
 
3.0%
Common
ValueCountFrequency (%)
746
51.8%
0 314
21.8%
1 152
 
10.6%
< 110
 
7.6%
3 52
 
3.6%
- 22
 
1.5%
5 16
 
1.1%
2 10
 
0.7%
7 6
 
0.4%
9 3
 
0.2%
Other values (5) 9
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13545
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 4198
31.0%
l 2792
20.6%
a 1481
 
10.9%
s 1380
 
10.2%
v 1376
 
10.2%
746
 
5.5%
0 314
 
2.3%
t 156
 
1.2%
1 152
 
1.1%
f 143
 
1.1%
Other values (33) 807
 
6.0%

minimumDepthInMeters
Text

Missing 

Distinct2
Distinct (%)66.7%
Missing601448
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length6
Median length6
Mean length5.666666667
Min length5

Characters and Unicode

Total characters17
Distinct characters7
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique1
Unique (%)33.3%

Sample

1st row853.0
2nd row1600.0
3rd row1600.0
ValueCountFrequency (%)
1600.0 2
66.7%
853.0 1
33.3%

Most occurring characters

ValueCountFrequency (%)
0 7
41.2%
. 3
17.6%
1 2
 
11.8%
6 2
 
11.8%
8 1
 
5.9%
5 1
 
5.9%
3 1
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
82.4%
Other Punctuation 3
 
17.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 7
50.0%
1 2
 
14.3%
6 2
 
14.3%
8 1
 
7.1%
5 1
 
7.1%
3 1
 
7.1%
Other Punctuation
ValueCountFrequency (%)
. 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 17
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 7
41.2%
. 3
17.6%
1 2
 
11.8%
6 2
 
11.8%
8 1
 
5.9%
5 1
 
5.9%
3 1
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 7
41.2%
. 3
17.6%
1 2
 
11.8%
6 2
 
11.8%
8 1
 
5.9%
5 1
 
5.9%
3 1
 
5.9%

decimalLatitude
Text

Missing 

Distinct10264
Distinct (%)6.7%
Missing448433
Missing (%)74.6%
Memory size4.6 MiB

Length

Max length9
Median length8
Mean length5.04639977
Min length3

Characters and Unicode

Total characters772190
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1

Unique

Unique4985
Unique (%)3.3%

Sample

1st row5.98
2nd row34.68
3rd row31.5011
4th row29.37
5th row34.4863
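
Because decimalLatitude is profiled as Text, any numeric use requires parsing the strings first. The following is a minimal sketch of such a conversion with a range check; `parse_latitude` is a hypothetical helper, and treating empty strings as missing is an assumption about how missing cells are represented.

```python
def parse_latitude(text):
    """Return the latitude as a float, or None if missing, malformed or out of range."""
    if text is None or not text.strip():
        return None
    try:
        value = float(text)
    except ValueError:
        return None
    # Valid latitudes lie in [-90, 90] degrees.
    return value if -90.0 <= value <= 90.0 else None

# The first three inputs come from the sample rows; the last two are made-up edge cases.
for raw in ["5.98", "34.68", "31.5011", "", "123.4"]:
    print(repr(raw), "->", parse_latitude(raw))
```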
ValueCountFrequency (%)
5.3 1716
 
1.1%
2.78 1090
 
0.7%
5.67 1073
 
0.7%
0.88 979
 
0.6%
3.65 946
 
0.6%
8.83 814
 
0.5%
10.53 811
 
0.5%
3.17 798
 
0.5%
8.17 759
 
0.5%
7.32 742
 
0.5%
Other values (9276) 143290
93.6%

Most occurring characters

ValueCountFrequency (%)
. 153018
19.8%
3 85567
11.1%
2 76884
10.0%
1 68690
8.9%
5 67202
8.7%
8 61499
8.0%
7 57931
 
7.5%
6 42374
 
5.5%
0 42282
 
5.5%
9 41004
 
5.3%
Other values (2) 75739
9.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 581467
75.3%
Other Punctuation 153018
 
19.8%
Dash Punctuation 37705
 
4.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 85567
14.7%
2 76884
13.2%
1 68690
11.8%
5 67202
11.6%
8 61499
10.6%
7 57931
10.0%
6 42374
7.3%
0 42282
7.3%
9 41004
7.1%
4 38034
6.5%
Other Punctuation
ValueCountFrequency (%)
. 153018
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 37705
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 772190
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153018
19.8%
3 85567
11.1%
2 76884
10.0%
1 68690
8.9%
5 67202
8.7%
8 61499
8.0%
7 57931
 
7.5%
6 42374
 
5.5%
0 42282
 
5.5%
9 41004
 
5.3%
Other values (2) 75739
9.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 772190
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153018
19.8%
3 85567
11.1%
2 76884
10.0%
1 68690
8.9%
5 67202
8.7%
8 61499
8.0%
7 57931
 
7.5%
6 42374
 
5.5%
0 42282
 
5.5%
9 41004
 
5.3%
Other values (2) 75739
9.8%

decimalLongitude
Text

Missing 

Distinct11880
Distinct (%)7.8%
Missing448433
Missing (%)74.6%
Memory size4.6 MiB

Length

Max length10
Median length9
Mean length5.651550798
Min length3

Characters and Unicode

Total characters864789
Distinct characters12
Distinct categories3
Distinct scripts1
Distinct blocks1

Unique

Unique5910
Unique (%)3.9%

Sample

1st row-61.43
2nd row-76.7
3rd row65.8453
4th row-94.82
5th row74.6026
ValueCountFrequency (%)
66.22 1723
 
1.1%
16.42 1090
 
0.7%
127.68 955
 
0.6%
0.2 930
 
0.6%
70.5 790
 
0.5%
71.95 739
 
0.5%
79.62 722
 
0.5%
0.22 681
 
0.4%
0.97 651
 
0.4%
66.18 629
 
0.4%
Other values (11070) 144108
94.2%

Most occurring characters

ValueCountFrequency (%)
. 153018
17.7%
- 87916
10.2%
2 86605
10.0%
1 81267
9.4%
7 80986
9.4%
3 68421
7.9%
6 62202
7.2%
8 58615
 
6.8%
5 58531
 
6.8%
0 50551
 
5.8%
Other values (2) 76677
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 623855
72.1%
Other Punctuation 153018
 
17.7%
Dash Punctuation 87916
 
10.2%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 86605
13.9%
1 81267
13.0%
7 80986
13.0%
3 68421
11.0%
6 62202
10.0%
8 58615
9.4%
5 58531
9.4%
0 50551
8.1%
4 38663
6.2%
9 38014
6.1%
Other Punctuation
ValueCountFrequency (%)
. 153018
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 87916
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 864789
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 153018
17.7%
- 87916
10.2%
2 86605
10.0%
1 81267
9.4%
7 80986
9.4%
3 68421
7.9%
6 62202
7.2%
8 58615
 
6.8%
5 58531
 
6.8%
0 50551
 
5.8%
Other values (2) 76677
8.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 864789
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 153018
17.7%
- 87916
10.2%
2 86605
10.0%
1 81267
9.4%
7 80986
9.4%
3 68421
7.9%
6 62202
7.2%
8 58615
 
6.8%
5 58531
 
6.8%
0 50551
 
5.8%
Other values (2) 76677
8.9%

geodeticDatum
Text

Missing 

Distinct2
Distinct (%)< 0.1%
Missing594543
Missing (%)98.9%
Memory size4.6 MiB

Length

Max length18
Median length18
Mean length17.99681529
Min length7

Characters and Unicode

Total characters124322
Distinct characters19
Distinct categories7
Distinct scripts2
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowWGS 84 (EPSG:4326)
2nd rowWGS 84 (EPSG:4326)
3rd rowWGS 84 (EPSG:4326)
4th rowWGS 84 (EPSG:4326)
5th rowWGS 84 (EPSG:4326)
ValueCountFrequency (%)
wgs 6906
33.3%
84 6906
33.3%
epsg:4326 6906
33.3%
unknown 2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
G 13812
11.1%
S 13812
11.1%
(space) 13812
11.1%
4 13812
11.1%
W 6906
 
5.6%
) 6906
 
5.6%
6 6906
 
5.6%
2 6906
 
5.6%
3 6906
 
5.6%
: 6906
 
5.6%
Other values (9) 27638
22.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 48342
38.9%
Decimal Number 41436
33.3%
Space Separator 13812
 
11.1%
Close Punctuation 6906
 
5.6%
Other Punctuation 6906
 
5.6%
Open Punctuation 6906
 
5.6%
Lowercase Letter 14
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
G 13812
28.6%
S 13812
28.6%
W 6906
14.3%
P 6906
14.3%
E 6906
14.3%
Decimal Number
ValueCountFrequency (%)
4 13812
33.3%
6 6906
16.7%
2 6906
16.7%
3 6906
16.7%
8 6906
16.7%
Lowercase Letter
ValueCountFrequency (%)
n 6
42.9%
u 2
 
14.3%
k 2
 
14.3%
o 2
 
14.3%
w 2
 
14.3%
Space Separator
ValueCountFrequency (%)
13812
100.0%
Close Punctuation
ValueCountFrequency (%)
) 6906
100.0%
Other Punctuation
ValueCountFrequency (%)
: 6906
100.0%
Open Punctuation
ValueCountFrequency (%)
( 6906
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 75966
61.1%
Latin 48356
38.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 13812
28.6%
S 13812
28.6%
W 6906
14.3%
P 6906
14.3%
E 6906
14.3%
n 6
 
< 0.1%
u 2
 
< 0.1%
k 2
 
< 0.1%
o 2
 
< 0.1%
w 2
 
< 0.1%
Common
ValueCountFrequency (%)
13812
18.2%
4 13812
18.2%
) 6906
9.1%
6 6906
9.1%
2 6906
9.1%
3 6906
9.1%
: 6906
9.1%
( 6906
9.1%
8 6906
9.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 124322
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 13812
11.1%
S 13812
11.1%
13812
11.1%
4 13812
11.1%
W 6906
 
5.6%
) 6906
 
5.6%
6 6906
 
5.6%
2 6906
 
5.6%
3 6906
 
5.6%
: 6906
 
5.6%
Other values (9) 27638
22.2%

verbatimLatitude
Text

Missing 

Distinct11921
Distinct (%)8.8%
Missing466631
Missing (%)77.6%
Memory size4.6 MiB

Length

Max length20
Median length10
Mean length9.74341344
Min length3

Characters and Unicode

Total characters1313607
Distinct characters31
Distinct categories7
Distinct scripts2
Distinct blocks2

Unique

Unique6353
Unique (%)4.7%

Sample

1st row05 59 -- N
2nd row34 41 4- N
3rd row29 22 1- N
4th row02 37 -- N
5th row28 39 -- S
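
The sample rows above follow a "degrees minutes seconds hemisphere" layout in which unrecorded digits appear as `-`. A minimal sketch of converting such strings to signed decimal degrees is given below. The regular expression, the helper name `dms_to_decimal` and the decision to treat any seconds field containing `-` as zero are all assumptions for illustration; other layouts present in the column (for example values using `°` or `deg`) are not handled and return `None`.

```python
import re

# Hypothetical pattern for values such as "05 59 -- N" or "061 26 -- W":
# degrees, minutes, seconds (digits or '-' placeholders), hemisphere letter.
_DMS = re.compile(r"^\s*(\d{1,3})\s+(\d{1,2})\s+([\d-]{1,2})\s+([NSEW])\s*$")

def dms_to_decimal(text):
    """Return signed decimal degrees, or None if the text does not match."""
    match = _DMS.match(text)
    if match is None:
        return None
    degrees, minutes, seconds, hemisphere = match.groups()
    # Assumption: a seconds field containing '-' is treated as unrecorded (0).
    secs = int(seconds) if seconds.isdigit() else 0
    value = int(degrees) + int(minutes) / 60 + secs / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value

for raw in ["05 59 -- N", "28 39 -- S", "061 26 -- W"]:
    print(raw, "->", dms_to_decimal(raw))
```

Dropping a partially recorded seconds value loses at most one arcminute of precision, which seemed the safer choice for a sketch than guessing the missing digit.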
ValueCountFrequency (%)
106698
21.4%
n 93430
18.7%
s 28456
 
5.7%
10 13100
 
2.6%
09 10526
 
2.1%
08 8990
 
1.8%
05 8987
 
1.8%
07 8164
 
1.6%
30 8001
 
1.6%
06 7215
 
1.4%
Other values (2490) 205546
41.2%

Most occurring characters

ValueCountFrequency (%)
(space) 364293
27.7%
- 242090
18.4%
0 121688
 
9.3%
N 102689
 
7.8%
1 82952
 
6.3%
2 72522
 
5.5%
3 69271
 
5.3%
5 58613
 
4.5%
4 50109
 
3.8%
9 32877
 
2.5%
Other values (21) 116503
 
8.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 568428
43.3%
Space Separator 364293
27.7%
Dash Punctuation 242090
18.4%
Uppercase Letter 134294
 
10.2%
Other Punctuation 3746
 
0.3%
Lowercase Letter 412
 
< 0.1%
Other Symbol 344
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 121688
21.4%
1 82952
14.6%
2 72522
12.8%
3 69271
12.2%
5 58613
10.3%
4 50109
8.8%
9 32877
 
5.8%
8 28469
 
5.0%
7 27206
 
4.8%
6 24721
 
4.3%
Uppercase Letter
ValueCountFrequency (%)
N 102689
76.5%
S 31520
 
23.5%
W 74
 
0.1%
E 4
 
< 0.1%
M 3
 
< 0.1%
O 3
 
< 0.1%
A 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 2935
78.4%
' 801
 
21.4%
? 6
 
0.2%
* 2
 
0.1%
; 1
 
< 0.1%
" 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
d 136
33.0%
g 136
33.0%
e 136
33.0%
c 2
 
0.5%
a 2
 
0.5%
Space Separator
ValueCountFrequency (%)
364293
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 242090
100.0%
Other Symbol
ValueCountFrequency (%)
° 344
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1178901
89.7%
Latin 134706
 
10.3%

Most frequent character per script

Common
ValueCountFrequency (%)
364293
30.9%
- 242090
20.5%
0 121688
 
10.3%
1 82952
 
7.0%
2 72522
 
6.2%
3 69271
 
5.9%
5 58613
 
5.0%
4 50109
 
4.3%
9 32877
 
2.8%
8 28469
 
2.4%
Other values (9) 56017
 
4.8%
Latin
ValueCountFrequency (%)
N 102689
76.2%
S 31520
 
23.4%
d 136
 
0.1%
g 136
 
0.1%
e 136
 
0.1%
W 74
 
0.1%
E 4
 
< 0.1%
M 3
 
< 0.1%
O 3
 
< 0.1%
c 2
 
< 0.1%
Other values (2) 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1313263
> 99.9%
None 344
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
364293
27.7%
- 242090
18.4%
0 121688
 
9.3%
N 102689
 
7.8%
1 82952
 
6.3%
2 72522
 
5.5%
3 69271
 
5.3%
5 58613
 
4.5%
4 50109
 
3.8%
9 32877
 
2.5%
Other values (20) 116159
 
8.8%
None
ValueCountFrequency (%)
° 344
100.0%

verbatimLongitude
Text

Missing 

Distinct13154
Distinct (%)9.8%
Missing466723
Missing (%)77.6%
Memory size4.6 MiB

Length

Max length22
Median length11
Mean length10.73756012
Min length3

Characters and Unicode

Total characters1446650
Distinct characters28
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique7533 ?
Unique (%)5.6%
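The summary figures for verbatimLongitude agree with one another if the missing percentage is taken over all 601451 observations, while the distinct and unique percentages are taken over the non-missing rows only. That reading is inferred from the numbers themselves and is not a documented rule of the profiling tool. A short check, with variable names chosen for illustration:

```python
# Figures copied from the report; n_observations comes from the overview.
n_observations = 601451
n_missing = 466723
n_distinct = 13154
n_unique = 7533
total_characters = 1446650

# Rows that actually hold a value.
n_present = n_observations - n_missing

# Missing (%) is relative to all observations.
missing_pct = round(100 * n_missing / n_observations, 1)

# Distinct (%) and Unique (%) appear to be relative to non-missing rows.
distinct_pct = round(100 * n_distinct / n_present, 1)
unique_pct = round(100 * n_unique / n_present, 1)

# Mean length is total characters divided by non-missing rows.
mean_length = total_characters / n_present

print(n_present, missing_pct, distinct_pct, unique_pct, mean_length)
```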

Sample

1st row061 26 -- W
2nd row076 42 1- W
3rd row094 49 4- W
4th row066 19 -- W
5th row020 15 -- E
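The sample rows follow a degrees, minutes, seconds and hemisphere layout, with dashes where seconds were not recorded. The sketch below converts such strings to signed decimal degrees. It rests on several assumptions that the report does not confirm: the regular expression covers only this one layout, a seconds field containing any dash (including partial values such as `4-`) is treated as zero, and the function name `dms_to_decimal` is hypothetical. Values in other layouts, for example those containing `°` or `deg`, return `None`.

```python
import re

# Degrees (2-3 digits), minutes (2 digits), seconds (2 characters,
# digits or dashes), then a hemisphere letter.
DMS_PATTERN = re.compile(r"^(\d{2,3}) (\d{2}) ([\d-]{2}) ([NSEW])$")


def dms_to_decimal(text):
    """Convert 'DDD MM SS H' to signed decimal degrees, or None if unparsed."""
    match = DMS_PATTERN.match(text.strip())
    if match is None:
        return None
    degrees, minutes, seconds, hemisphere = match.groups()
    # Assumption: unknown or partially recorded seconds count as zero.
    sec = 0.0 if "-" in seconds else float(seconds)
    value = int(degrees) + int(minutes) / 60 + sec / 3600
    # South and west are negative by convention.
    return -value if hemisphere in "SW" else value


for raw in ["061 26 -- W", "020 15 -- E", "05 59 -- N"]:
    print(raw, "->", dms_to_decimal(raw))
```

Dropping the seconds field loses at most one minute of arc, so results from this sketch are only suitable for coarse checks against the dataset's decimal coordinate fields.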
ValueCountFrequency (%)
106940
21.4%
w 73770
 
14.8%
e 47858
 
9.6%
000 6910
 
1.4%
00 4768
 
1.0%
46 4542
 
0.9%
002 4510
 
0.9%
13 4306
 
0.9%
001 3732
 
0.7%
066 3560
 
0.7%
Other values (2805) 238004
47.7%

Most occurring characters

ValueCountFrequency (%)
364172
25.2%
- 243702
16.8%
0 216851
15.0%
1 88638
 
6.1%
W 80745
 
5.6%
2 78239
 
5.4%
3 58623
 
4.1%
5 53520
 
3.7%
E 53220
 
3.7%
4 52072
 
3.6%
Other values (18) 156868
10.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 700323
48.4%
Space Separator 364172
25.2%
Dash Punctuation 243702
 
16.8%
Uppercase Letter 133992
 
9.3%
Other Punctuation 3709
 
0.3%
Lowercase Letter 408
 
< 0.1%
Other Symbol 344
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 216851
31.0%
1 88638
12.7%
2 78239
 
11.2%
3 58623
 
8.4%
5 53520
 
7.6%
4 52072
 
7.4%
6 50424
 
7.2%
7 43836
 
6.3%
8 32353
 
4.6%
9 25767
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
W 80745
60.3%
E 53220
39.7%
N 16
 
< 0.1%
S 9
 
< 0.1%
O 1
 
< 0.1%
C 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 2897
78.1%
' 801
 
21.6%
? 7
 
0.2%
* 2
 
0.1%
" 1
 
< 0.1%
; 1
 
< 0.1%
Lowercase Letter
ValueCountFrequency (%)
d 136
33.3%
e 136
33.3%
g 136
33.3%
Space Separator
ValueCountFrequency (%)
364172
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 243702
100.0%
Other Symbol
ValueCountFrequency (%)
° 344
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1312250
90.7%
Latin 134400
 
9.3%

Most frequent character per script

Common
ValueCountFrequency (%)
364172
27.8%
- 243702
18.6%
0 216851
16.5%
1 88638
 
6.8%
2 78239
 
6.0%
3 58623
 
4.5%
5 53520
 
4.1%
4 52072
 
4.0%
6 50424
 
3.8%
7 43836
 
3.3%
Other values (9) 62173
 
4.7%
Latin
ValueCountFrequency (%)
W 80745
60.1%
E 53220
39.6%
d 136
 
0.1%
e 136
 
0.1%
g 136
 
0.1%
N 16
 
< 0.1%
S 9
 
< 0.1%
O 1
 
< 0.1%
C 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1446306
> 99.9%
None 344
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
364172
25.2%
- 243702
16.8%
0 216851
15.0%
1 88638
 
6.1%
W 80745
 
5.6%
2 78239
 
5.4%
3 58623
 
4.1%
5 53520
 
3.7%
E 53220
 
3.7%
4 52072
 
3.6%
Other values (17) 156524
10.8%
None
ValueCountFrequency (%)
° 344
100.0%
Distinct4
Distinct (%)< 0.1%
Missing468202
Missing (%)77.8%
Memory size4.6 MiB

Length

Max length23
Median length23
Mean length22.96475771
Min length3

Characters and Unicode

Total characters3060031
Distinct characters22
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowDegrees Minutes Seconds
2nd rowDegrees Minutes Seconds
3rd rowDegrees Minutes Seconds
4th rowDegrees Minutes Seconds
5th rowDegrees Minutes Seconds
ValueCountFrequency (%)
degrees 133004
33.3%
minutes 133003
33.3%
seconds 133003
33.3%
utm 192
 
< 0.1%
unknown 53
 
< 0.1%
decimal 1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e 665019
21.7%
s 399010
13.0%
n 266165
 
8.7%
266007
 
8.7%
M 133195
 
4.4%
o 133056
 
4.3%
D 133004
 
4.3%
c 133004
 
4.3%
g 133004
 
4.3%
r 133004
 
4.3%
Other values (12) 665563
21.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2394385
78.2%
Uppercase Letter 399639
 
13.1%
Space Separator 266007
 
8.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 665019
27.8%
s 399010
16.7%
n 266165
11.1%
o 133056
 
5.6%
c 133004
 
5.6%
g 133004
 
5.6%
r 133004
 
5.6%
i 133004
 
5.6%
d 133004
 
5.6%
t 133003
 
5.6%
Other values (6) 133112
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
M 133195
33.3%
D 133004
33.3%
S 133003
33.3%
U 245
 
0.1%
T 192
 
< 0.1%
Space Separator
ValueCountFrequency (%)
266007
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2794024
91.3%
Common 266007
 
8.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 665019
23.8%
s 399010
14.3%
n 266165
9.5%
M 133195
 
4.8%
o 133056
 
4.8%
D 133004
 
4.8%
c 133004
 
4.8%
g 133004
 
4.8%
r 133004
 
4.8%
i 133004
 
4.8%
Other values (11) 532559
19.1%
Common
ValueCountFrequency (%)
266007
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3060031
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 665019
21.7%
s 399010
13.0%
n 266165
 
8.7%
266007
 
8.7%
M 133195
 
4.4%
o 133056
 
4.3%
D 133004
 
4.3%
c 133004
 
4.3%
g 133004
 
4.3%
r 133004
 
4.3%
Other values (12) 665563
21.8%

georeferenceProtocol
Text

Missing 

Distinct8
Distinct (%)0.1%
Missing592196
Missing (%)98.5%
Memory size4.6 MiB

Length

Max length26
Median length12
Mean length10.66731496
Min length3

Characters and Unicode

Total characters98726
Distinct characters32
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1 ?
Unique (%)< 0.1%

Sample

1st rowGoogle Earth
2nd rowGoogle Earth
3rd rowGPS
4th rowGoogle Earth
5th rowGoogle Earth
ValueCountFrequency (%)
google 7074
41.5%
earth 7074
41.5%
gps 1418
 
8.3%
usgs 530
 
3.1%
topoview 530
 
3.1%
gazetteer 137
 
0.8%
atlas 42
 
0.2%
of 42
 
0.2%
canada 42
 
0.2%
42
 
0.2%
Other values (4) 96
 
0.6%

Most occurring characters

ValueCountFrequency (%)
o 15334
15.5%
G 9159
9.3%
e 8096
8.2%
t 8000
8.1%
7772
7.9%
a 7479
7.6%
r 7294
7.4%
l 7116
7.2%
h 7076
7.2%
g 7074
7.2%
Other values (22) 14326
14.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 69543
70.4%
Uppercase Letter 21368
 
21.6%
Space Separator 7772
 
7.9%
Dash Punctuation 42
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 15334
22.0%
e 8096
11.6%
t 8000
11.5%
a 7479
10.8%
r 7294
10.5%
l 7116
10.2%
h 7076
10.2%
g 7074
10.2%
p 586
 
0.8%
w 530
 
0.8%
Other values (8) 958
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
G 9159
42.9%
E 7074
33.1%
S 2478
 
11.6%
P 1418
 
6.6%
V 530
 
2.5%
U 530
 
2.5%
A 42
 
0.2%
C 42
 
0.2%
T 42
 
0.2%
I 39
 
0.2%
Space Separator
ValueCountFrequency (%)
7772
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 90911
92.1%
Common 7815
 
7.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 15334
16.9%
G 9159
10.1%
e 8096
8.9%
t 8000
8.8%
a 7479
8.2%
r 7294
8.0%
l 7116
7.8%
h 7076
7.8%
g 7074
7.8%
E 7074
7.8%
Other values (19) 7209
7.9%
Common
ValueCountFrequency (%)
7772
99.4%
- 42
 
0.5%
. 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 98726
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 15334
15.5%
G 9159
9.3%
e 8096
8.2%
t 8000
8.1%
7772
7.9%
a 7479
7.6%
r 7294
7.4%
l 7116
7.2%
h 7076
7.2%
g 7074
7.2%
Other values (22) 14326
14.5%

georeferenceRemarks
Text

Missing 

Distinct8
Distinct (%)11.8%
Missing601383
Missing (%)> 99.9%
Memory size4.6 MiB

Length

Max length35
Median length35
Mean length31.20588235
Min length5

Characters and Unicode

Total characters2122
Distinct characters34
Distinct categories5 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4 ?
Unique (%)5.9%

Sample

1st rowGarmin Etrex Vista HCX, Datum WGS84
2nd rowGarmin Etrex Vista HCX, Datum WGS84
3rd rowGarmin Etrex Vista HCX, Datum WGS84
4th rowGarmin Etrex Vista HCX, Datum WGS84
5th rowGarmin Etrex Vista HCX, Datum WGS84
ValueCountFrequency (%)
garmin 54
15.1%
etrex 54
15.1%
vista 54
15.1%
hcx 54
15.1%
datum 54
15.1%
wgs84 54
15.1%
camp 7
 
2.0%
coordinates 7
 
2.0%
for 6
 
1.7%
longitude 2
 
0.6%
Other values (7) 12
 
3.4%

Most occurring characters

ValueCountFrequency (%)
290
 
13.7%
a 184
 
8.7%
t 175
 
8.2%
r 132
 
6.2%
i 123
 
5.8%
m 118
 
5.6%
G 108
 
5.1%
e 73
 
3.4%
n 67
 
3.2%
s 62
 
2.9%
Other values (24) 790
37.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1118
52.7%
Uppercase Letter 551
26.0%
Space Separator 290
 
13.7%
Decimal Number 108
 
5.1%
Other Punctuation 55
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 184
16.5%
t 175
15.7%
r 132
11.8%
i 123
11.0%
m 118
10.6%
e 73
 
6.5%
n 67
 
6.0%
s 62
 
5.5%
u 58
 
5.2%
x 56
 
5.0%
Other values (8) 70
 
6.3%
Uppercase Letter
ValueCountFrequency (%)
G 108
19.6%
C 61
11.1%
S 54
9.8%
W 54
9.8%
D 54
9.8%
X 54
9.8%
H 54
9.8%
V 54
9.8%
E 54
9.8%
L 2
 
0.4%
Decimal Number
ValueCountFrequency (%)
4 54
50.0%
8 54
50.0%
Other Punctuation
ValueCountFrequency (%)
, 54
98.2%
; 1
 
1.8%
Space Separator
ValueCountFrequency (%)
290
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1669
78.7%
Common 453
 
21.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 184
 
11.0%
t 175
 
10.5%
r 132
 
7.9%
i 123
 
7.4%
m 118
 
7.1%
G 108
 
6.5%
e 73
 
4.4%
n 67
 
4.0%
s 62
 
3.7%
C 61
 
3.7%
Other values (19) 566
33.9%
Common
ValueCountFrequency (%)
290
64.0%
4 54
 
11.9%
8 54
 
11.9%
, 54
 
11.9%
; 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2122
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
290
 
13.7%
a 184
 
8.7%
t 175
 
8.2%
r 132
 
6.2%
i 123
 
5.8%
m 118
 
5.6%
G 108
 
5.1%
e 73
 
3.4%
n 67
 
3.2%
s 62
 
2.9%
Other values (24) 790
37.2%
Distinct4
Distinct (%)0.3%
Missing599947
Missing (%)99.7%
Memory size4.6 MiB

Length

Max length9
Median length9
Mean length8.412234043
Min length3

Characters and Unicode

Total characters12652
Distinct characters14
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowuncertain
2nd rowuncertain
3rd rowuncertain
4th rowuncertain
5th rowcf.
ValueCountFrequency (%)
uncertain 1355
90.0%
cf 147
 
9.8%
sp 2
 
0.1%
near 2
 
0.1%

Most occurring characters

ValueCountFrequency (%)
n 2712
21.4%
c 1502
11.9%
e 1357
10.7%
r 1357
10.7%
a 1357
10.7%
t 1355
10.7%
i 1355
10.7%
u 1315
10.4%
. 149
 
1.2%
f 147
 
1.2%
Other values (4) 46
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12461
98.5%
Other Punctuation 149
 
1.2%
Uppercase Letter 40
 
0.3%
Space Separator 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 2712
21.8%
c 1502
12.1%
e 1357
10.9%
r 1357
10.9%
a 1357
10.9%
t 1355
10.9%
i 1355
10.9%
u 1315
10.6%
f 147
 
1.2%
s 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 149
100.0%
Uppercase Letter
ValueCountFrequency (%)
U 40
100.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12501
98.8%
Common 151
 
1.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 2712
21.7%
c 1502
12.0%
e 1357
10.9%
r 1357
10.9%
a 1357
10.9%
t 1355
10.8%
i 1355
10.8%
u 1315
10.5%
f 147
 
1.2%
U 40
 
0.3%
Other values (2) 4
 
< 0.1%
Common
ValueCountFrequency (%)
. 149
98.7%
2
 
1.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12652
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 2712
21.4%
c 1502
11.9%
e 1357
10.7%
r 1357
10.7%
a 1357
10.7%
t 1355
10.7%
i 1355
10.7%
u 1315
10.4%
. 149
 
1.2%
f 147
 
1.2%
Other values (4) 46
 
0.4%

typeStatus
Text

Missing 

Distinct10
Distinct (%)0.3%
Missing597685
Missing (%)99.4%
Memory size4.6 MiB

Length

Max length18
Median length4
Mean length4.250929368
Min length4

Characters and Unicode

Total characters16009
Distinct characters20
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3 ?
Unique (%)0.1%

Sample

1st rowLectotype
2nd rowType
3rd rowType
4th rowType
5th rowType
ValueCountFrequency (%)
type 3590
94.5%
syntype 83
 
2.2%
lectotype 68
 
1.8%
renamed 28
 
0.7%
neotype 12
 
0.3%
holotype 12
 
0.3%
nomen 2
 
0.1%
nudem 2
 
0.1%

Most occurring characters

ValueCountFrequency (%)
e 3905
24.4%
y 3848
24.0%
p 3765
23.5%
T 3590
22.4%
t 243
 
1.5%
n 113
 
0.7%
o 106
 
0.7%
S 83
 
0.5%
L 68
 
0.4%
c 68
 
0.4%
Other values (10) 220
 
1.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12152
75.9%
Uppercase Letter 3797
 
23.7%
Space Separator 31
 
0.2%
Other Punctuation 29
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3905
32.1%
y 3848
31.7%
p 3765
31.0%
t 243
 
2.0%
n 113
 
0.9%
o 106
 
0.9%
c 68
 
0.6%
m 32
 
0.3%
d 30
 
0.2%
a 28
 
0.2%
Other values (2) 14
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
T 3590
94.5%
S 83
 
2.2%
L 68
 
1.8%
R 28
 
0.7%
N 16
 
0.4%
H 12
 
0.3%
Space Separator
ValueCountFrequency (%)
31
100.0%
Other Punctuation
ValueCountFrequency (%)
; 29
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15949
99.6%
Common 60
 
0.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3905
24.5%
y 3848
24.1%
p 3765
23.6%
T 3590
22.5%
t 243
 
1.5%
n 113
 
0.7%
o 106
 
0.7%
S 83
 
0.5%
L 68
 
0.4%
c 68
 
0.4%
Other values (8) 160
 
1.0%
Common
ValueCountFrequency (%)
31
51.7%
; 29
48.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 16009
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3905
24.4%
y 3848
24.0%
p 3765
23.5%
T 3590
22.4%
t 243
 
1.5%
n 113
 
0.7%
o 106
 
0.7%
S 83
 
0.5%
L 68
 
0.4%
c 68
 
0.4%
Other values (10) 220
 
1.4%

identifiedBy
Text

Missing 

Distinct95
Distinct (%)1.2%
Missing593267
Missing (%)98.6%
Memory size4.6 MiB

Length

Max length132
Median length124
Mean length94.36840176
Min length10

Characters and Unicode

Total characters772311
Distinct characters58
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique21 ?
Unique (%)0.3%

Sample

1st rowO'Neill, Jennifer K., Fort Hayes State University
2nd rowGardner, Alfred L., Curator (USGS), United States Geological Survey (UNITED STATES)
3rd rowWoodman, Neal, (USGS), United States Geological Survey (UNITED STATES)
4th rowLunde, Darrin P., Collections Manager (MAM), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
5th rowReeder, DeeAnn M., Bucknell University (UNITED STATES)
ValueCountFrequency (%)
states 8033
 
7.9%
united 8033
 
7.9%
of 5420
 
5.3%
museum 5255
 
5.2%
natural 5077
 
5.0%
history 5077
 
5.0%
national 5064
 
5.0%
smithsonian 5007
 
4.9%
institution 5007
 
4.9%
4859
 
4.8%
Other values (272) 44753
44.1%

Most occurring characters

ValueCountFrequency (%)
93401
 
12.1%
t 49895
 
6.5%
o 47659
 
6.2%
i 45409
 
5.9%
a 41696
 
5.4%
e 39504
 
5.1%
n 38647
 
5.0%
s 36580
 
4.7%
r 29451
 
3.8%
u 25575
 
3.3%
Other values (48) 324494
42.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 444828
57.6%
Uppercase Letter 174361
 
22.6%
Space Separator 93401
 
12.1%
Other Punctuation 28613
 
3.7%
Open Punctuation 13070
 
1.7%
Close Punctuation 13070
 
1.7%
Dash Punctuation 4968
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 49895
11.2%
o 47659
10.7%
i 45409
10.2%
a 41696
9.4%
e 39504
8.9%
n 38647
8.7%
s 36580
8.2%
r 29451
6.6%
u 25575
 
5.7%
l 25043
 
5.6%
Other values (15) 65369
14.7%
Uppercase Letter
ValueCountFrequency (%)
S 24263
13.9%
T 20751
11.9%
M 20290
11.6%
N 18192
10.4%
E 15013
8.6%
A 13295
7.6%
I 12131
7.0%
U 9985
5.7%
D 9079
 
5.2%
H 8379
 
4.8%
Other values (14) 22983
13.2%
Other Punctuation
ValueCountFrequency (%)
, 22186
77.5%
. 6353
 
22.2%
' 69
 
0.2%
; 4
 
< 0.1%
& 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
93401
100.0%
Open Punctuation
ValueCountFrequency (%)
( 13070
100.0%
Close Punctuation
ValueCountFrequency (%)
) 13070
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4968
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 619189
80.2%
Common 153122
 
19.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 49895
 
8.1%
o 47659
 
7.7%
i 45409
 
7.3%
a 41696
 
6.7%
e 39504
 
6.4%
n 38647
 
6.2%
s 36580
 
5.9%
r 29451
 
4.8%
u 25575
 
4.1%
l 25043
 
4.0%
Other values (39) 239730
38.7%
Common
ValueCountFrequency (%)
93401
61.0%
, 22186
 
14.5%
( 13070
 
8.5%
) 13070
 
8.5%
. 6353
 
4.1%
- 4968
 
3.2%
' 69
 
< 0.1%
; 4
 
< 0.1%
& 1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 772311
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
93401
 
12.1%
t 49895
 
6.5%
o 47659
 
6.2%
i 45409
 
5.9%
a 41696
 
5.4%
e 39504
 
5.1%
n 38647
 
5.0%
s 36580
 
4.7%
r 29451
 
3.8%
u 25575
 
3.3%
Other values (48) 324494
42.0%
Distinct7805
Distinct (%)1.3%
Missing0
Missing (%)0.0%
Memory size4.6 MiB

Length

Max length63
Median length43
Mean length22.61255364
Min length5

Characters and Unicode

Total characters13600343
Distinct characters63
Distinct categories7 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique898 ?
Unique (%)0.1%

Sample

1st rowPotos flavus
2nd rowMicrotus longicaudus longicaudus
3rd rowCarollia brevicauda
4th rowPeromyscus mexicanus totontepecus
5th rowTursiops truncatus
ValueCountFrequency (%)
peromyscus 38753
 
2.6%
sp 28343
 
1.9%
rattus 21929
 
1.5%
microtus 19877
 
1.3%
maniculatus 15880
 
1.1%
sorex 15831
 
1.1%
artibeus 12470
 
0.8%
carollia 12281
 
0.8%
tursiops 11895
 
0.8%
truncatus 11875
 
0.8%
Other values (5505) 1302266
87.3%

Most occurring characters

ValueCountFrequency (%)
s 1517215
 
11.2%
i 1187099
 
8.7%
a 1082276
 
8.0%
u 980723
 
7.2%
o 902387
 
6.6%
889949
 
6.5%
e 862255
 
6.3%
r 848292
 
6.2%
n 665623
 
4.9%
l 634731
 
4.7%
Other values (53) 4029793
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12079597
88.8%
Space Separator 889949
 
6.5%
Uppercase Letter 601771
 
4.4%
Other Punctuation 28356
 
0.2%
Open Punctuation 313
 
< 0.1%
Close Punctuation 313
 
< 0.1%
Decimal Number 44
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1517215
12.6%
i 1187099
9.8%
a 1082276
 
9.0%
u 980723
 
8.1%
o 902387
 
7.5%
e 862255
 
7.1%
r 848292
 
7.0%
n 665623
 
5.5%
l 634731
 
5.3%
t 618435
 
5.1%
Other values (16) 2780561
23.0%
Uppercase Letter
ValueCountFrequency (%)
M 103156
17.1%
P 84557
14.1%
C 58907
9.8%
S 54594
9.1%
T 51645
8.6%
A 32571
 
5.4%
R 31119
 
5.2%
G 28180
 
4.7%
L 23175
 
3.9%
N 23069
 
3.8%
Other values (14) 110798
18.4%
Decimal Number
ValueCountFrequency (%)
8 13
29.5%
1 12
27.3%
2 7
15.9%
9 6
13.6%
5 3
 
6.8%
0 2
 
4.5%
4 1
 
2.3%
Other Punctuation
ValueCountFrequency (%)
. 28343
> 99.9%
, 11
 
< 0.1%
/ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
889949
100.0%
Open Punctuation
ValueCountFrequency (%)
( 313
100.0%
Close Punctuation
ValueCountFrequency (%)
) 313
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12681368
93.2%
Common 918975
 
6.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1517215
12.0%
i 1187099
 
9.4%
a 1082276
 
8.5%
u 980723
 
7.7%
o 902387
 
7.1%
e 862255
 
6.8%
r 848292
 
6.7%
n 665623
 
5.2%
l 634731
 
5.0%
t 618435
 
4.9%
Other values (40) 3382332
26.7%
Common
ValueCountFrequency (%)
889949
96.8%
. 28343
 
3.1%
( 313
 
< 0.1%
) 313
 
< 0.1%
8 13
 
< 0.1%
1 12
 
< 0.1%
, 11
 
< 0.1%
2 7
 
< 0.1%
9 6
 
< 0.1%
5 3
 
< 0.1%
Other values (3) 5
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13600343
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1517215
 
11.2%
i 1187099
 
8.7%
a 1082276
 
8.0%
u 980723
 
7.2%
o 902387
 
6.6%
889949
 
6.5%
e 862255
 
6.3%
r 848292
 
6.2%
n 665623
 
4.9%
l 634731
 
4.7%
Other values (53) 4029793
29.6%
Distinct253
Distinct (%)< 0.1%
Missing7
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length121
Median length113
Mean length90.64064651
Min length11

Characters and Unicode

Total characters54515273
Distinct characters48
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10 ?
Unique (%)< 0.1%

Sample

1st rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Carnivora, Caniformia, Procyonidae
2nd rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Cricetidae, Arvicolinae
3rd rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Chiroptera, Phyllostomidae, Carolliinae
4th rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Rodentia, Myomorpha, Cricetidae, Neotominae
5th rowAnimalia, Chordata, Vertebrata, Mammalia, Eutheria, Cetacea, Odontoceti, Delphinidae
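Each sample value above is a comma-separated lineage running from kingdom down to family or subfamily. A minimal sketch for splitting such a string into its taxon names follows. The helper name `split_lineage` is hypothetical, and the sketch assumes that no taxon name itself contains a comma.

```python
def split_lineage(value):
    """Split a comma-separated classification string into taxon names."""
    # Strip surrounding whitespace and skip empty pieces.
    return [name.strip() for name in value.split(",") if name.strip()]


row = ("Animalia, Chordata, Vertebrata, Mammalia, Eutheria, "
       "Cetacea, Odontoceti, Delphinidae")
lineage = split_lineage(row)

# First element is the broadest rank, last element the most specific.
print(lineage[0], "->", lineage[-1], f"({len(lineage)} levels)")
```

Lineages differ in depth between rows (eight levels in some samples, nine in others), so position in the list does not map reliably to a fixed rank.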
ValueCountFrequency (%)
animalia 601442
11.9%
vertebrata 601442
11.9%
chordata 601442
11.9%
mammalia 601441
11.9%
eutheria 593341
11.7%
rodentia 297636
 
5.9%
myomorpha 209417
 
4.1%
chiroptera 129086
 
2.5%
cricetidae 107243
 
2.1%
muridae 93911
 
1.9%
Other values (328) 1234181
24.3%

Most occurring characters

ValueCountFrequency (%)
a 8383067
15.4%
i 4797600
 
8.8%
, 4469138
 
8.2%
4469138
 
8.2%
e 4068524
 
7.5%
r 4037606
 
7.4%
t 3533330
 
6.5%
o 2704288
 
5.0%
m 2453478
 
4.5%
h 1861673
 
3.4%
Other values (38) 13737431
25.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40506415
74.3%
Uppercase Letter 5070582
 
9.3%
Other Punctuation 4469138
 
8.2%
Space Separator 4469138
 
8.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8383067
20.7%
i 4797600
11.8%
e 4068524
10.0%
r 4037606
10.0%
t 3533330
8.7%
o 2704288
 
6.7%
m 2453478
 
6.1%
h 1861673
 
4.6%
n 1678993
 
4.1%
l 1675363
 
4.1%
Other values (14) 5312493
13.1%
Uppercase Letter
ValueCountFrequency (%)
C 1067853
21.1%
M 1065447
21.0%
A 654487
12.9%
V 641586
12.7%
E 615945
12.1%
R 302881
 
6.0%
S 237180
 
4.7%
P 112443
 
2.2%
D 65158
 
1.3%
N 62146
 
1.2%
Other values (12) 245456
 
4.8%
Other Punctuation
ValueCountFrequency (%)
, 4469138
100.0%
Space Separator
ValueCountFrequency (%)
4469138
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 45576997
83.6%
Common 8938276
 
16.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8383067
18.4%
i 4797600
10.5%
e 4068524
 
8.9%
r 4037606
 
8.9%
t 3533330
 
7.8%
o 2704288
 
5.9%
m 2453478
 
5.4%
h 1861673
 
4.1%
n 1678993
 
3.7%
l 1675363
 
3.7%
Other values (36) 10383075
22.8%
Common
ValueCountFrequency (%)
, 4469138
50.0%
4469138
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54515273
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8383067
15.4%
i 4797600
 
8.8%
, 4469138
 
8.2%
4469138
 
8.2%
e 4068524
 
7.5%
r 4037606
 
7.4%
t 3533330
 
6.5%
o 2704288
 
5.0%
m 2453478
 
4.5%
h 1861673
 
3.4%
Other values (38) 13737431
25.2%

kingdom
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing9
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4811536
Distinct characters6
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAnimalia
2nd rowAnimalia
3rd rowAnimalia
4th rowAnimalia
5th rowAnimalia
ValueCountFrequency (%)
animalia 601442
100.0%

Most occurring characters

ValueCountFrequency (%)
i 1202884
25.0%
a 1202884
25.0%
A 601442
12.5%
n 601442
12.5%
m 601442
12.5%
l 601442
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210094
87.5%
Uppercase Letter 601442
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1202884
28.6%
a 1202884
28.6%
n 601442
14.3%
m 601442
14.3%
l 601442
14.3%
Uppercase Letter
ValueCountFrequency (%)
A 601442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811536
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 1202884
25.0%
a 1202884
25.0%
A 601442
12.5%
n 601442
12.5%
m 601442
12.5%
l 601442
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1202884
25.0%
a 1202884
25.0%
A 601442
12.5%
n 601442
12.5%
m 601442
12.5%
l 601442
12.5%

phylum
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing9
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4811536
Distinct characters7
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowChordata
2nd rowChordata
3rd rowChordata
4th rowChordata
5th rowChordata
ValueCountFrequency (%)
chordata 601442
100.0%

Most occurring characters

ValueCountFrequency (%)
a 1202884
25.0%
C 601442
12.5%
h 601442
12.5%
o 601442
12.5%
r 601442
12.5%
d 601442
12.5%
t 601442
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210094
87.5%
Uppercase Letter 601442
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1202884
28.6%
h 601442
14.3%
o 601442
14.3%
r 601442
14.3%
d 601442
14.3%
t 601442
14.3%
Uppercase Letter
ValueCountFrequency (%)
C 601442
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811536
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1202884
25.0%
C 601442
12.5%
h 601442
12.5%
o 601442
12.5%
r 601442
12.5%
d 601442
12.5%
t 601442
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811536
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1202884
25.0%
C 601442
12.5%
h 601442
12.5%
o 601442
12.5%
r 601442
12.5%
d 601442
12.5%
t 601442
12.5%

class
Text

Constant 

Distinct1
Distinct (%)< 0.1%
Missing10
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters4811528
Distinct characters5
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMammalia
2nd rowMammalia
3rd rowMammalia
4th rowMammalia
5th rowMammalia
ValueCountFrequency (%)
mammalia 601441
100.0%

Most occurring characters

ValueCountFrequency (%)
a 1804323
37.5%
m 1202882
25.0%
M 601441
 
12.5%
l 601441
 
12.5%
i 601441
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4210087
87.5%
Uppercase Letter 601441
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1804323
42.9%
m 1202882
28.6%
l 601441
 
14.3%
i 601441
 
14.3%
Uppercase Letter
ValueCountFrequency (%)
M 601441
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4811528
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1804323
37.5%
m 1202882
25.0%
M 601441
 
12.5%
l 601441
 
12.5%
i 601441
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4811528
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1804323
37.5%
m 1202882
25.0%
M 601441
 
12.5%
l 601441
 
12.5%
i 601441
 
12.5%

order
Text

Distinct29
Distinct (%)< 0.1%
Missing10
Missing (%)< 0.1%
Memory size4.6 MiB

Length

Max length16
Median length8
Mean length8.868953064
Min length6

Characters and Unicode

Total characters5334152
Distinct characters32
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCarnivora
2nd rowRodentia
3rd rowChiroptera
4th rowRodentia
5th rowCetacea
ValueCountFrequency (%)
rodentia 297636
49.5%
chiroptera 129086
21.5%
cetacea 47582
 
7.9%
carnivora 47293
 
7.9%
soricomorpha 30383
 
5.1%
lagomorpha 11977
 
2.0%
artiodactyla 11375
 
1.9%
primates 10781
 
1.8%
didelphimorphia 5643
 
0.9%
diprotodontia 1652
 
0.3%
Other values (19) 8033
 
1.3%

Most occurring characters

ValueCountFrequency (%)
a 725642
13.6%
o 618973
11.6%
i 555649
10.4%
e 546091
10.2%
t 514232
9.6%
r 462546
8.7%
n 351517
6.6%
d 320912
6.0%
R 297636
5.6%
C 224380
 
4.2%
Other values (22) 716574
13.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4732711
88.7%
Uppercase Letter 601441
 
11.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 725642
15.3%
o 618973
13.1%
i 555649
11.7%
e 546091
11.5%
t 514232
10.9%
r 462546
9.8%
n 351517
7.4%
d 320912
6.8%
p 186076
 
3.9%
h 184413
 
3.9%
Other values (10) 266660
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
R 297636
49.5%
C 224380
37.3%
S 32307
 
5.4%
P 12509
 
2.1%
A 12049
 
2.0%
L 11977
 
2.0%
D 7784
 
1.3%
M 1503
 
0.2%
E 940
 
0.2%
H 341
 
0.1%
Other values (2) 15
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 5334152
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 725642
13.6%
o 618973
11.6%
i 555649
10.4%
e 546091
10.2%
t 514232
9.6%
r 462546
8.7%
n 351517
6.6%
d 320912
6.0%
R 297636
5.6%
C 224380
 
4.2%
Other values (22) 716574
13.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5334152
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 725642
13.6%
o 618973
11.6%
i 555649
10.4%
e 546091
10.2%
t 514232
9.6%
r 462546
8.7%
n 351517
6.6%
d 320912
6.0%
R 297636
5.6%
C 224380
 
4.2%
Other values (22) 716574
13.4%

family
Text

Distinct: 153
Distinct (%): < 0.1%
Missing: 7
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 16
Mean length: 10.23417143
Min length: 6

Characters and Unicode

Total characters: 6155281
Distinct characters: 42
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): < 0.1%

Sample

1st row: Procyonidae
2nd row: Cricetidae
3rd row: Phyllostomidae
4th row: Cricetidae
5th row: Delphinidae

Value                Count     Frequency (%)
cricetidae           107243    17.8%
muridae              93911     15.6%
phyllostomidae       55530     9.2%
sciuridae            46130     7.7%
soricidae            27470     4.6%
vespertilionidae     25753     4.3%
delphinidae          23642     3.9%
heteromyidae         19997     3.3%
molossidae           13560     2.3%
canidae              12559     2.1%
Other values (143)   175649    29.2%

Most occurring characters

ValueCountFrequency (%)
e 949239
15.4%
i 924898
15.0%
a 664049
10.8%
d 634246
10.3%
r 409806
 
6.7%
o 348989
 
5.7%
t 274918
 
4.5%
l 232701
 
3.8%
c 221432
 
3.6%
u 159979
 
2.6%
Other values (32) 1335024
21.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5553837
90.2%
Uppercase Letter 601444
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 949239
17.1%
i 924898
16.7%
a 664049
12.0%
d 634246
11.4%
r 409806
7.4%
o 348989
 
6.3%
t 274918
 
5.0%
l 232701
 
4.2%
c 221432
 
4.0%
u 159979
 
2.9%
Other values (12) 733580
13.2%
Uppercase Letter
ValueCountFrequency (%)
C 134035
22.3%
M 130155
21.6%
P 81969
13.6%
S 74875
12.4%
D 35147
 
5.8%
V 26962
 
4.5%
H 26673
 
4.4%
B 14305
 
2.4%
G 12230
 
2.0%
E 11823
 
2.0%
Other values (10) 53270
 
8.9%

Most occurring scripts

ValueCountFrequency (%)
Latin 6155281
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 949239
15.4%
i 924898
15.0%
a 664049
10.8%
d 634246
10.3%
r 409806
 
6.7%
o 348989
 
5.7%
t 274918
 
4.5%
l 232701
 
3.8%
c 221432
 
3.6%
u 159979
 
2.6%
Other values (32) 1335024
21.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6155281
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 949239
15.4%
i 924898
15.0%
a 664049
10.8%
d 634246
10.3%
r 409806
 
6.7%
o 348989
 
5.7%
t 274918
 
4.5%
l 232701
 
3.8%
c 221432
 
3.6%
u 159979
 
2.6%
Other values (32) 1335024
21.7%

genus
Text

Distinct: 1136
Distinct (%): 0.2%
Missing: 16
Missing (%): < 0.1%
Memory size: 4.6 MiB

Length

Max length: 17
Median length: 15
Mean length: 8.505181774
Min length: 2

Characters and Unicode

Total characters: 5115314
Distinct characters: 50
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 72
Unique (%): < 0.1%

Sample

1st row: Potos
2nd row: Microtus
3rd row: Carollia
4th row: Peromyscus
5th row: Tursiops

Value                 Count     Frequency (%)
peromyscus            38753     6.4%
microtus              19877     3.3%
rattus                16463     2.7%
sorex                 15826     2.6%
artibeus              12470     2.1%
carollia              12281     2.0%
tursiops              11894     2.0%
tamias                11871     2.0%
mastomys              11447     1.9%
mus                   10554     1.8%
Other values (1126)   439999    73.2%

Most occurring characters

ValueCountFrequency (%)
s 603660
 
11.8%
o 513897
 
10.0%
a 349432
 
6.8%
r 348676
 
6.8%
u 336000
 
6.6%
i 331710
 
6.5%
e 315840
 
6.2%
t 247363
 
4.8%
l 220400
 
4.3%
m 215951
 
4.2%
Other values (40) 1632385
31.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4513879
88.2%
Uppercase Letter 601435
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 603660
13.4%
o 513897
11.4%
a 349432
 
7.7%
r 348676
 
7.7%
u 336000
 
7.4%
i 331710
 
7.3%
e 315840
 
7.0%
t 247363
 
5.5%
l 220400
 
4.9%
m 215951
 
4.8%
Other values (16) 1030950
22.8%
Uppercase Letter
ValueCountFrequency (%)
M 102970
17.1%
P 84552
14.1%
C 58805
9.8%
S 54592
9.1%
T 51642
8.6%
A 32571
 
5.4%
R 31119
 
5.2%
G 28178
 
4.7%
L 23171
 
3.9%
N 23063
 
3.8%
Other values (14) 110772
18.4%

Most occurring scripts

ValueCountFrequency (%)
Latin 5115314
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 603660
 
11.8%
o 513897
 
10.0%
a 349432
 
6.8%
r 348676
 
6.8%
u 336000
 
6.6%
i 331710
 
6.5%
e 315840
 
6.2%
t 247363
 
4.8%
l 220400
 
4.3%
m 215951
 
4.2%
Other values (40) 1632385
31.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5115314
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 603660
 
11.8%
o 513897
 
10.0%
a 349432
 
6.8%
r 348676
 
6.8%
u 336000
 
6.6%
i 331710
 
6.5%
e 315840
 
6.2%
t 247363
 
4.8%
l 220400
 
4.3%
m 215951
 
4.2%
Other values (40) 1632385
31.9%

subgenus
Text

Missing 

Distinct: 3
Distinct (%): 1.0%
Missing: 601149
Missing (%): 99.9%
Memory size: 4.6 MiB

Length

Max length: 12
Median length: 12
Mean length: 10.82781457
Min length: 9

Characters and Unicode

Total characters: 3270
Distinct characters: 15
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mallodelphys
2nd row: Eumarmosa
3rd row: Caluromys
4th row: Caluromys
5th row: Eumarmosa

Value          Count   Frequency (%)
mallodelphys   184     60.9%
caluromys      95      31.5%
eumarmosa      23      7.6%

Most occurring characters

ValueCountFrequency (%)
l 647
19.8%
a 325
9.9%
o 302
9.2%
s 302
9.2%
y 279
8.5%
M 184
 
5.6%
d 184
 
5.6%
e 184
 
5.6%
p 184
 
5.6%
h 184
 
5.6%
Other values (5) 495
15.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2968
90.8%
Uppercase Letter 302
 
9.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 647
21.8%
a 325
11.0%
o 302
10.2%
s 302
10.2%
y 279
9.4%
d 184
 
6.2%
e 184
 
6.2%
p 184
 
6.2%
h 184
 
6.2%
m 141
 
4.8%
Other values (2) 236
 
8.0%
Uppercase Letter
ValueCountFrequency (%)
M 184
60.9%
C 95
31.5%
E 23
 
7.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 3270
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 647
19.8%
a 325
9.9%
o 302
9.2%
s 302
9.2%
y 279
8.5%
M 184
 
5.6%
d 184
 
5.6%
e 184
 
5.6%
p 184
 
5.6%
h 184
 
5.6%
Other values (5) 495
15.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3270
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 647
19.8%
a 325
9.9%
o 302
9.2%
s 302
9.2%
y 279
8.5%
M 184
 
5.6%
d 184
 
5.6%
e 184
 
5.6%
p 184
 
5.6%
h 184
 
5.6%
Other values (5) 495
15.1%

specificEpithet
Text

Distinct: 2774
Distinct (%): 0.5%
Missing: 678
Missing (%): 0.1%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 14
Mean length: 8.402236785
Min length: 2

Characters and Unicode

Total characters: 5047837
Distinct characters: 29
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 260
Unique (%): < 0.1%

Sample

1st row: flavus
2nd row: longicaudus
3rd row: brevicauda
4th row: mexicanus
5th row: truncatus

Value                 Count     Frequency (%)
sp                    28335     4.7%
maniculatus           15647     2.6%
truncatus             11873     2.0%
musculus              8553      1.4%
perspicillata         8339      1.4%
leucopus              7382      1.2%
brevicauda            7356      1.2%
pennsylvanicus        6840      1.1%
jamaicensis           5581      0.9%
rattus                5466      0.9%
Other values (2764)   495405    82.5%

Most occurring characters

ValueCountFrequency (%)
s 600065
11.9%
i 553121
11.0%
a 505496
10.0%
u 460127
9.1%
e 329136
 
6.5%
r 328084
 
6.5%
n 326046
 
6.5%
l 286917
 
5.7%
t 270410
 
5.4%
c 259844
 
5.1%
Other values (19) 1128591
22.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5019496
99.4%
Other Punctuation 28337
 
0.6%
Space Separator 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 600065
12.0%
i 553121
11.0%
a 505496
10.1%
u 460127
9.2%
e 329136
 
6.6%
r 328084
 
6.5%
n 326046
 
6.5%
l 286917
 
5.7%
t 270410
 
5.4%
c 259844
 
5.2%
Other values (16) 1100250
21.9%
Other Punctuation
ValueCountFrequency (%)
. 28335
> 99.9%
/ 2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5019496
99.4%
Common 28341
 
0.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 600065
12.0%
i 553121
11.0%
a 505496
10.1%
u 460127
9.2%
e 329136
 
6.6%
r 328084
 
6.5%
n 326046
 
6.5%
l 286917
 
5.7%
t 270410
 
5.4%
c 259844
 
5.2%
Other values (16) 1100250
21.9%
Common
ValueCountFrequency (%)
. 28335
> 99.9%
(space) 4
 
< 0.1%
/ 2
 
< 0.1%
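
The "." count (28335) equals the count of the token "sp" in the value table, which suggests, without confirming, that the punctuation comes from placeholder values such as "sp." rather than from real epithets. A minimal sketch for locating such records is given below. The list `epithets` is hypothetical: "sp." and "alpha/beta" are assumed shapes for the values behind the "." and "/" counts, not entries copied from the dataset.

```python
import unicodedata

# Hypothetical values: three plain epithets plus assumed placeholder and
# slash-separated forms. Only "flavus" and "truncatus" appear in the report.
epithets = ["flavus", "sp.", "truncatus", "sp.", "alpha/beta"]


def non_letter_values(values):
    """Return values containing any character outside category 'Ll' (lowercase letter)."""
    return [
        v for v in values
        if any(unicodedata.category(ch) != "Ll" for ch in v)
    ]


flagged = non_letter_values(epithets)
print(flagged)
```

Reviewing the flagged values before any taxonomic matching avoids treating a placeholder as a species name.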

Most occurring blocks

ValueCountFrequency (%)
ASCII 5047837
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 600065
11.9%
i 553121
11.0%
a 505496
10.0%
u 460127
9.1%
e 329136
 
6.5%
r 328084
 
6.5%
n 326046
 
6.5%
l 286917
 
5.7%
t 270410
 
5.4%
c 259844
 
5.1%
Other values (19) 1128591
22.4%

infraspecificEpithet
Text

Missing 

Distinct: 2646
Distinct (%): 0.9%
Missing: 314922
Missing (%): 52.4%
Memory size: 4.6 MiB

Length

Max length: 19
Median length: 15
Mean length: 8.827504371
Min length: 3

Characters and Unicode

Total characters: 2529336
Distinct characters: 26
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 227
Unique (%): 0.1%

Sample

1st row: longicaudus
2nd row: totontepecus
3rd row: marinensis
4th row: bairdii
5th row: merriami

Value                 Count     Frequency (%)
noveboracensis        4836      1.7%
domesticus            4357      1.5%
pennsylvanicus        4127      1.4%
talpoides             3712      1.3%
cinereus              3602      1.3%
sonoriensis           2279      0.8%
gambelii              2247      0.8%
trowbridgii           2145      0.7%
merriami              2101      0.7%
longicaudus           2081      0.7%
Other values (2636)   255042    89.0%

Most occurring characters

ValueCountFrequency (%)
s 311070
12.3%
i 300653
11.9%
a 225909
8.9%
e 216894
8.6%
n 194319
 
7.7%
u 182648
 
7.2%
r 170773
 
6.8%
o 145249
 
5.7%
l 125981
 
5.0%
c 119143
 
4.7%
Other values (16) 536697
21.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2529336
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 311070
12.3%
i 300653
11.9%
a 225909
8.9%
e 216894
8.6%
n 194319
 
7.7%
u 182648
 
7.2%
r 170773
 
6.8%
o 145249
 
5.7%
l 125981
 
5.0%
c 119143
 
4.7%
Other values (16) 536697
21.2%

Most occurring scripts

ValueCountFrequency (%)
Latin 2529336
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 311070
12.3%
i 300653
11.9%
a 225909
8.9%
e 216894
8.6%
n 194319
 
7.7%
u 182648
 
7.2%
r 170773
 
6.8%
o 145249
 
5.7%
l 125981
 
5.0%
c 119143
 
4.7%
Other values (16) 536697
21.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2529336
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 311070
12.3%
i 300653
11.9%
a 225909
8.9%
e 216894
8.6%
n 194319
 
7.7%
u 182648
 
7.2%
r 170773
 
6.8%
o 145249
 
5.7%
l 125981
 
5.0%
c 119143
 
4.7%
Other values (16) 536697
21.2%

taxonRank
Text

Constant  Missing 

Distinct: 1
Distinct (%): < 0.1%
Missing: 314922
Missing (%): 52.4%
Memory size: 4.6 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2865290
Distinct characters: 7
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
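
Because taxonRank is constant, its reported figures can be cross-checked with simple arithmetic. The sketch below takes the observation count from the report overview and the missing count from this variable's summary, then derives the number of present values, the missing percentage and the total character count. Variable names are illustrative.

```python
total_rows = 601451               # "Number of observations" in the report overview
missing = 314922                  # "Missing" for taxonRank
value_length = len("subspecies")  # the only value present in the column

present = total_rows - missing
missing_pct = round(100 * missing / total_rows, 1)
total_chars = present * value_length

print(present, missing_pct, total_chars)
```

The derived values agree with the report: 286529 present records, 52.4% missing, and 2865290 total characters.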

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: subspecies
2nd row: subspecies
3rd row: subspecies
4th row: subspecies
5th row: subspecies

Value        Count     Frequency (%)
subspecies   286529    100.0%

Most occurring characters

ValueCountFrequency (%)
s 859587
30.0%
e 573058
20.0%
u 286529
 
10.0%
b 286529
 
10.0%
p 286529
 
10.0%
c 286529
 
10.0%
i 286529
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2865290
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 859587
30.0%
e 573058
20.0%
u 286529
 
10.0%
b 286529
 
10.0%
p 286529
 
10.0%
c 286529
 
10.0%
i 286529
 
10.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2865290
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 859587
30.0%
e 573058
20.0%
u 286529
 
10.0%
b 286529
 
10.0%
p 286529
 
10.0%
c 286529
 
10.0%
i 286529
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2865290
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 859587
30.0%
e 573058
20.0%
u 286529
 
10.0%
b 286529
 
10.0%
p 286529
 
10.0%
c 286529
 
10.0%
i 286529
 
10.0%

scientificNameAuthorship
Text

Distinct: 176
Distinct (%): 0.4%
Missing: 555607
Missing (%): 92.4%
Memory size: 4.6 MiB

Length

Max length: 27
Median length: 23
Mean length: 8.940755606
Min length: 4

Characters and Unicode

Total characters: 409880
Distinct characters: 55
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 69
Unique (%): 0.2%

Sample

1st row: (Montagu)
2nd row: (Montagu)
3rd row: (Linnaeus)
4th row: (Cuvier)
5th row: Stejneger

Value                Count    Frequency (%)
linnaeus             14516    29.5%
montagu              11845    24.1%
gray                 4024     8.2%
cuvier               2015     4.1%
de                   1265     2.6%
blainville           1263     2.6%
traill               1101     2.2%
true                 1018     2.1%
lacepede             1006     2.0%
lilljeborg           934      1.9%
Other values (159)   10239    20.8%
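
The value table above lists lowercased words without parentheses, and "de" and "blainville" appear as separate entries, so the counts seem to be taken over word tokens rather than whole field values. The sketch below reproduces that behaviour under this assumption; the profiler's actual tokenizer may differ. The first five entries are the sample rows from the report, and "(de Blainville)" is a hypothetical value added to show a multi-word authorship.

```python
import re
from collections import Counter

# Five sample rows from the report plus one hypothetical multi-word value.
authorships = [
    "(Montagu)", "(Montagu)", "(Linnaeus)", "(Cuvier)", "Stejneger",
    "(de Blainville)",  # assumed form, not copied from the dataset
]


def token_counts(values):
    """Count lowercased alphabetic tokens; brackets and spaces act as separators."""
    counts = Counter()
    for value in values:
        counts.update(re.findall(r"[a-z]+", value.lower()))
    return counts


counts = token_counts(authorships)
print(counts.most_common())
```

Under this reading, the frequencies in the table are shares of tokens, so they need not sum to the number of non-missing records.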

Most occurring characters

ValueCountFrequency (%)
n 46261
11.3%
( 37392
 
9.1%
) 37392
 
9.1%
a 37146
 
9.1%
e 31946
 
7.8%
u 30372
 
7.4%
i 24799
 
6.1%
s 19879
 
4.8%
L 17196
 
4.2%
o 17121
 
4.2%
Other values (45) 110376
26.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 283189
69.1%
Uppercase Letter 47324
 
11.5%
Open Punctuation 37392
 
9.1%
Close Punctuation 37392
 
9.1%
Space Separator 3382
 
0.8%
Other Punctuation 1198
 
0.3%
Dash Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 46261
16.3%
a 37146
13.1%
e 31946
11.3%
u 30372
10.7%
i 24799
8.8%
s 19879
7.0%
o 17121
 
6.0%
r 14853
 
5.2%
g 13602
 
4.8%
t 13339
 
4.7%
Other values (16) 33871
12.0%
Uppercase Letter
ValueCountFrequency (%)
L 17196
36.3%
M 13479
28.5%
G 5210
 
11.0%
B 2551
 
5.4%
T 2132
 
4.5%
C 2021
 
4.3%
O 1040
 
2.2%
P 767
 
1.6%
F 755
 
1.6%
D 616
 
1.3%
Other values (12) 1557
 
3.3%
Other Punctuation
ValueCountFrequency (%)
& 683
57.0%
' 426
35.6%
. 89
 
7.4%
Open Punctuation
ValueCountFrequency (%)
( 37392
100.0%
Close Punctuation
ValueCountFrequency (%)
) 37392
100.0%
Space Separator
ValueCountFrequency (%)
(space) 3382
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 330513
80.6%
Common 79367
 
19.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 46261
14.0%
a 37146
11.2%
e 31946
9.7%
u 30372
 
9.2%
i 24799
 
7.5%
s 19879
 
6.0%
L 17196
 
5.2%
o 17121
 
5.2%
r 14853
 
4.5%
g 13602
 
4.1%
Other values (38) 77338
23.4%
Common
ValueCountFrequency (%)
( 37392
47.1%
) 37392
47.1%
(space) 3382
 
4.3%
& 683
 
0.9%
' 426
 
0.5%
. 89
 
0.1%
- 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 409880
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 46261
11.3%
( 37392
 
9.1%
) 37392
 
9.1%
a 37146
 
9.1%
e 31946
 
7.8%
u 30372
 
7.4%
i 24799
 
6.1%
s 19879
 
4.8%
L 17196
 
4.2%
o 17121
 
4.2%
Other values (45) 110376
26.9%